(54) Video optimized media streamer with cache management

(57) A data storage system includes a mass storage unit storing a data entity, such as a digital representation
of a video presentation, that is partitioned into a plurality N of temporally-ordered segments. A data buffer is
bidirectionally coupled to the mass storage unit for storing up to M of the temporally-ordered segments, wherein M
is less than N. The data buffer has an output for outputting stored ones of the temporally-ordered segments.
The data storage system further includes a data buffer manager for scheduling transfers of individual ones of
the temporally-ordered segments between the mass storage unit and the data buffer. The data buffer manager
schedules the transfers in accordance with at least a predicted time that an individual one of the
temporally-ordered segments will be required to be output from the data buffer. When employed with a media
streamer (10), distributed data buffer management techniques are employed for selecting blocks to be retained
in a buffer memory, either in a storage node (16, 17) or in a communication node (14). These techniques rely on
the predictable nature of the video data stream, and thus are enabled to predict the future requirements for a
given one of the data blocks.

Description

Field of the Invention 

This invention relates to a system for delivery of multimedia data and, more particularly, to an interactive video sender
system that provides video simultaneously to a plurality of terminals with minimal buffering.

Background of the invention 

The playing of movies and video is today accomplished with rather old technology. The primary storage media is
analog tape, such as VHS recorders/players, and extends up to the very high quality and very expensive D1 VTRs used
by television studios and broadcasters. There are many problems with this technology. A few such problems include:
the manual labour required to load the tapes, the wear and tear on the mechanical units, tape head, and the tape itself,
and also the expense. One significant limitation that troubles broadcast stations is that the VTRs can only perform one
function at a time, sequentially. Each tape unit costs from $75,000 to $150,000.

TV stations want to increase their revenues from commercials, which are nothing more than short movies, by in- 
serting special commercials into their standard programs and thereby targeting each city as a separate market. This is 
a difficult task with tape technology, even with the very expensive Digital D1 tape systems or tape robots. 

Traditional methods of delivery of multimedia data to end users fall into two categories: 1) broadcast industry
methods and 2) computer industry methods. Broadcast methods (including motion picture, cable, television network, and
record industries) generally provide storage in the form of analog or digitally recorded tape. The playing of tapes causes
isochronous data streams to be generated which are then moved through broadcast industry equipment to the end user.
Computer methods generally provide storage in the form of disks, or disks augmented with tape, and record data in
compressed digital formats such as DVI, JPEG and MPEG. On request, computers deliver non-isochronous data streams
to the end user, where hardware buffers and special application code smooth the data streams to enable continuous
viewing or listening.

Video tape subsystems have traditionally exhibited a cost advantage over computer disk subsystems due to the 
cost of the storage media. However, video tape subsystems have the disadvantages of tape management, access 
latency, and relatively low reliability. These disadvantages are increasingly significant as computer storage costs have
dropped, in combination with the advent of real-time digital compression/decompression techniques.

Though computer subsystems have exhibited compounding cost/performance improvements, they are not generally
considered to be "video friendly". Computers interface primarily to workstations and other computer terminals with
interfaces and protocols that are termed "non-isochronous". To assure smooth (isochronous) delivery of multimedia data
to the end user, computer systems require special application code and large buffers to overcome inherent weaknesses
in their traditional communication methods. Also, computers are not video friendly in that they lack compatible interfaces
to equipment in the multimedia industry which handle isochronous data streams and switch among them with a high
degree of accuracy.

With the introduction of the use of computers to compress and store video material in digital format, a revolution
has begun in several major industries such as television broadcasting, movie studio production, "Video on Demand"
over telephone lines, pay-per-view movies in hotels, etc. Compression technology has progressed to the point where
acceptable results can be achieved with compression ratios of 100x to 180x. Such compression ratios make random
access disk technology an attractive alternative to prior art tape systems.

With an ability to randomly access digital disk data and the very high bandwidth of disk systems, the required system
function and performance is within the performance, hardware cost, and expandability of disk technology. In the past,
the use of disk files to store video or movies was never really a consideration because of the cost of storage. That cost
has seen significant reductions in the recent past.

For the many new emerging markets that utilize compressed video data, using MPEG standards, there are several
ways in which video data can be stored in a cost effective manner. This invention provides a hierarchical solution to
many different performance requirements and results in a modular systems approach that can be customized to meet
market requirements.

Summary of the Invention

The invention provides a "video friendly" computer subsystem which enables isochronous data stream delivery in
a multimedia environment over traditional interfaces for that industry. A media streamer in accordance with the invention
is optimized for the delivery of isochronous data streams and can stream data into new computer networks with ATM
(Asynchronous Transfer Mode) technology. This invention eliminates the disadvantages of video tape while providing
a VTR (video tape recorder) metaphor for system control. The system of this invention provides the following features:
scalability to deliver from 1 to 1000's of independently controlled data streams to end users; an ability to deliver many
isochronous data streams from a single copy of data; mixed output interfaces; mixed data rates; a simple "open system"
control interface; automation control support; storage hierarchy support; and low cost per delivered stream.

In accordance with an aspect of this invention a data storage system includes a mass storage unit storing a data
entity, such as a digital representation of a video presentation, that is partitioned into a plurality N of temporally-ordered
segments. A data buffer is bidirectionally coupled to the mass storage unit for storing up to M of the temporally-ordered
segments, wherein M is less than N. The data buffer has an output for outputting stored ones of the temporally-ordered
segments. The data storage system further includes a data buffer manager for scheduling transfers of individual ones
of the temporally-ordered segments between the mass storage unit and the data buffer. The data buffer manager
schedules the transfers in accordance with at least a predicted time that an individual one of the temporally-ordered segments
will be required to be output from the data buffer.
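
As an illustration only, the following Python sketch models a buffer that holds at most M of the N segments and decides what to keep from each segment's predicted output time. The class name `BufferManager`, its method `request`, and the rule of evicting the segment whose predicted output time lies farthest in the future are assumptions made for this sketch; the text above states only that transfers are scheduled in accordance with a predicted output time.

```python
from dataclasses import dataclass, field


@dataclass
class BufferManager:
    """Hypothetical model of a buffer holding at most `capacity` (M) segments."""
    capacity: int
    # Maps a resident segment index to its predicted output time.
    resident: dict = field(default_factory=dict)

    def request(self, segment: int, predicted_time: float) -> list:
        """Bring `segment` into the buffer and return the segments evicted."""
        evicted = []
        if segment not in self.resident:
            while len(self.resident) >= self.capacity:
                # Assumed policy: drop the segment needed farthest in the future.
                victim = max(self.resident, key=self.resident.get)
                del self.resident[victim]
                evicted.append(victim)
        self.resident[segment] = predicted_time
        return evicted


if __name__ == "__main__":
    manager = BufferManager(capacity=2)
    manager.request(0, 1.0)
    manager.request(1, 2.0)
    print(manager.request(2, 1.5))  # segment 1 is needed last, so it is evicted
```

Because a video stream is consumed in temporal order, the predicted output time of each segment is known in advance, which is what makes a policy of this kind possible.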

Further in accordance with this invention there is provided a media streamer having at least one storage node for
storing a digital representation of at least one video presentation. The at least one video presentation requires a time T
to present in its entirety, and is stored as a plurality of N data blocks. Each data block is a T/N portion of the at least one
video presentation. The at least one storage node includes a first data buffer for buffering at least one of the N data
blocks. The media streamer further includes a plurality of communication nodes each having an input port that is coupled
via a circuit switch to an output of the first data buffer for sequentially receiving a plurality of the N data blocks therefrom.
The sequentially received N data blocks are associated with a same video presentation or with different video
presentations. Each of the plurality of communication nodes further has a plurality of output ports, wherein individual ones of
the plurality of output ports output a digital representation of one video presentation. Individual ones of the plurality of
communication nodes further include a second data buffer for buffering at least one of the N data blocks prior to outputting
the at least one of the N data blocks. The media streamer further includes at least one control node responsive to a first
operating condition for causing transfer of one of the N data blocks from the first data buffer to an output port of a first
communication node and also to an output port of a second communication node, the at least one control node being
further responsive to a second operating condition for causing transfer of one of the N data blocks from the first data
buffer to the second data buffer of one of the communication nodes, and for causing transfer of the one of the N data
blocks from the second data buffer to a plurality of the output ports of the one of the communication nodes.

Embodiments are disclosed of presently preferred distributed data buffer management techniques for selecting
blocks to be retained in a buffer memory, either in a storage node or in a communication node. These techniques rely
on the predictable nature of the video data stream, and thus are enabled to predict the future requirements for a given
one of the data blocks.

Brief Description of the Drawings 

The invention will now be described by way of example:

Fig. 1 is a block diagram of a media streamer incorporating the invention hereof;

Fig. 1A is a block diagram which illustrates further details of a circuit switch shown in Fig. 1;

Fig. 1B is a block diagram which illustrates further details of a tape storage node shown in Fig. 1;

Fig. 1C is a block diagram which illustrates further details of a disk storage node shown in Fig. 1;

Fig. 1D is a block diagram which illustrates further details of a communication node shown in Fig. 1;

Fig. 2 illustrates a list of video stream output control commands which are executed at high priority and a further
list of data management commands which are executed at lower priority;

Fig. 3 is a block diagram illustrating communication node data flow;

Fig. 4 is a block diagram illustrating disk storage node data flow;

Fig. 5 illustrates control message flow to enable a connect to be accomplished;

Fig. 6 illustrates control message flow to enable a play to occur;

Fig. 7 illustrates interfaces which exist between the media streamer and client control systems;

Fig. 8 illustrates a display panel showing a plurality of "soft" keys used to operate the media streamer;

Fig. 9 illustrates a load selection panel that is displayed upon selection of the load soft key on Fig. 8;

Fig. 10 illustrates a batch selection panel that is displayed when the batch key in Fig. 8 is selected;

Fig. 11 illustrates several client/server relationships which exist between a client control system and the media
streamer;

Fig. 12 illustrates a prior art technique for accessing video data and feeding it to one or more output ports;

Fig. 13 is a block diagram indicating how plural video ports can access a single video segment contained in a
communications node cache memory;

Fig. 14 is a block diagram illustrating how plural video ports have direct access to a video segment contained in
cache memory on the disk storage node;

Fig. 15 illustrates a memory allocation scheme employed by the invention hereof;

Fig. 16 illustrates a segmented logical file for a video 1;

Fig. 17 illustrates how the various segments of video 1 are striped across a plurality of disk drives;

Fig. 18 illustrates a prior art switch interface between a storage node and a cross bar switch;

Fig. 19 illustrates how the prior art switch interface shown in Fig. 18 is modified to provide extended output bandwidth
for a storage node;

Fig. 20 is a block diagram illustrating a procedure for assuring constant video output to a video output bus;

Fig. 21 illustrates a block diagram of a video adapter used in converting digital video data to analog video data; and

Fig. 22 is a block diagram showing control modules that enable SCSI bus commands to be employed to control the
video adapter card of Fig. 21.

Detailed Description of the Invention 



GLOSSARY 

In the following description, a number of terms are used that are described below:

AAL-5      ATM ADAPTATION LAYER-5: Refers to a class of ATM service suitable for data transmission.

ATM        ASYNCHRONOUS TRANSFER MODE: A high speed switching and transport technology that
           can be used in a local or wide area network, or both. It is designed to carry both data and
           video/audio.

Betacam    A professional quality analog video format.

CCIR 601   A standard resolution for digital television. 720 x 480 (for NTSC) or 720 x 576 (for PAL)
           luminance, with chrominance subsampled 2:1 horizontally.

CPU        CENTRAL PROCESSING UNIT: In computer architecture, the main entity that processes
           computer instructions.

CRC        CYCLIC REDUNDANCY CHECK: A data error detection scheme.

D1         Digital video recording format conforming to CCIR 601. Records on 19mm video tape.

D2         Digital video recording format conforming to SMPTE 244M. Records on 19mm video tape.

D3         Digital video recording format conforming to SMPTE 244M. Records on 1/2" video tape.

DASD       DIRECT ACCESS STORAGE DEVICE: Any on-line data storage device or CD-ROM player that
           can be addressed is a DASD. Used synonymously with magnetic disk drive.

DMA        DIRECT MEMORY ACCESS: A method of moving data in a computer architecture that does
           not require the CPU to move the data.

DVI        A relatively low quality digital video compression format usually used to play video from CD-ROM
           disks to computer screens.

E1         European equivalent of T1.

FIFO       FIRST IN FIRST OUT: Queue handling method that operates on a first-come, first-served basis.

GenLock    Refers to a process of synchronization to another video signal. It is required in computer capture
           of video to synchronize the digitizing process with the scanning parameters of the video signal.

I/O        INPUT/OUTPUT

Isochronous  Used to describe information that is time sensitive and that is sent (preferably) without
           interruptions. Video and audio data sent in real time are isochronous.

JPEG       JOINT PHOTOGRAPHIC EXPERT GROUP: A working committee under the auspices of the
           International Standards Organization that is defining a proposed universal standard for digital
           compression of still images for use in computer systems.

KB         KILO BYTES: 1024 bytes.

LAN        LOCAL AREA NETWORK: High-speed transmission over twisted pair, coax, or fibre optic cables
           that connect terminals, computers and peripherals together at distances of about a mile or less.

LRU        LEAST RECENTLY USED

MPEG       MOVING PICTURE EXPERTS GROUP: A working committee under the auspices of the
           International Standards Organization that is defining standards for the digital
           compression/decompression of motion video/audio. MPEG-1 is the initial standard and is in use.
           MPEG-2 will be the next standard and will support digital, flexible, scalable video transport. It
           will cover multiple resolutions, bit rates and delivery mechanisms.

MPEG-1, MPEG-2  See MPEG.

MRU        MOST RECENTLY USED

MTNU       MOST TIME TO NEXT USE

NTSC       NATIONAL TELEVISION STANDARDS COMMITTEE: The colour television format that is the
           standard in the United States and Japan.

PAL        PHASE ALTERNATION LINE: The colour television format that is the standard for Europe except
           for France.

PC         PERSONAL COMPUTER: A relatively low cost computer that can be used for home or business.

RAID       REDUNDANT ARRAY of INEXPENSIVE DISKS: A storage arrangement that uses several
           magnetic or optical disks working in tandem to increase bandwidth output and to provide redundant
           backup.

SCSI       SMALL COMPUTER SYSTEM INTERFACE: An industry standard for connecting peripheral
           devices and their controllers to a computer.

SIF        SOURCE INPUT FORMAT: One quarter the CCIR 601 resolution.

SMPTE      SOCIETY OF MOTION PICTURE & TELEVISION ENGINEERS.

SSA        SERIAL STORAGE ARCHITECTURE: A standard for connecting peripheral devices and their
           controllers to computers. A possible replacement for SCSI.

T1         Digital interface into the telephone network with a bit rate of 1.544 Mb/sec.

TCP/IP     TRANSMISSION CONTROL PROTOCOL/INTERNET PROTOCOL: A set of protocols
           developed by the Department of Defense to link dissimilar computers across networks.

VHS        VERTICAL HELICAL SCAN: A common format for recording analog video on magnetic tape.

VTR        VIDEO TAPE RECORDER: A device for recording video on magnetic tape.

VCR        VIDEO CASSETTE RECORDER: Same as VTR.

A. GENERAL ARCHITECTURE 

A video optimized stream sender system 10 (hereafter referred to as media streamer) is shown in Fig. 1 and includes
four architecturally distinct components to provide scalability, high availability and configuration flexibility. The major
components follow:

1) Low Latency Switch 12: a hardware/microcode component with a primary task of delivering data and control
information between Communication Nodes 14, one or more Storage Nodes 16, 17 and one or more Control Nodes
18.

2) Communication Node 14: a hardware/microcode component with the primary task of enabling the "playing"
(delivering data isochronously) or "recording" (receiving data isochronously) over an externally defined interface
usually familiar to the broadcast industry: NTSC, PAL, D1, D2, etc. The digital-to-video interface is embodied in a
video card contained in a plurality of video ports 15 connected at the output of each communication node 14.

3) Storage Node 16, 17: a hardware/microcode component with the primary task of managing a storage medium
such as disk and associated storage availability options.

4) Control Node 18: a hardware/microcode component with the primary task of receiving and executing control
commands from an externally defined subsystem interface familiar to the computer industry.

A typical 64-node media streamer implementation might contain 31 communication nodes, 31 storage nodes, and
2 control nodes interconnected with the low latency switch 12. A smaller system might contain no switch and a single
hardware node that supports communications, storage and control functions. The design of media streamer 10 allows
a small system to grow to a large system in the customer installation. In all configurations, the functional capability of
media streamer 10 can remain the same except for the number of streams delivered and the number of multimedia
hours stored.

In Fig. 1A, further details of low latency switch 12 are shown. A plurality of circuit switch chips (not shown) are 
interconnected on crossbar switch cards 20 which are interconnected via a planar board (schematically shown). The 
planar and a single card 20 constitute a low latency crossbar switch with 16 node ports. Additional cards 20 may be 
added to configure additional node ports and, if desired, active redundant node ports for high availability. Each port of
the low latency switch 12 enables, by example, a 25 megabyte per second, full duplex communication channel.

Information is transferred through the switch 12 in packets. Each packet contains a header portion that controls the
switching state of individual crossbar switch points in each of the switch chips. The control node 18 provides the other
nodes (storage nodes 16, 17 and communication nodes 14) with the information necessary to enable peer-to-peer
operation via the low latency switch 12.

In Fig. 1B, internal details of a tape storage node 17 are illustrated. As will be hereafter understood, tape storage
node 17 provides a high capacity storage facility for storage of digital representations of video presentations.

As employed herein a video presentation can include one or more images that are suitable for display and/or
processing. A video presentation may include an audio portion. The one or more images may be logically related, such as
sequential frames of a film, movie, or animation sequence. The images may originally be generated by a camera, by a
digital computer, or by a combination of a camera and a digital computer. The audio portion may be synchronized with
the display of successive images. As employed herein a data representation of a video presentation can be any suitable
digital data format for representing one or more images and possibly audio. The digital data may be encoded and/or
compressed.

Referring again to Fig. 1B, a tape storage node 17 includes a tape library controller interface 24 which enables
access to multiple tape records contained in a tape library 26. A further interface 28 enables access to other tape libraries
via an SCSI bus interconnection. An internal system memory 30 enables a buffering of video data received from either
of interfaces 24 or 28, or via DMA data transfer path 32. System memory block 30 may be a portion of a PC 34 which
includes software 36 for tape library and file management actions. A switch interface and buffer module 38 (used also
in disk storage nodes 16, communication nodes 14, and control nodes 18) enables interconnection between the tape
storage node 17 and low latency switch 12. That is, the module 38 is responsible for partitioning a data transfer into
packets and adding the header portion to each packet that the switch 12 employs to route the packet. When receiving
a packet from the switch 12 the module 38 is responsible for stripping off the header portion before locally buffering or
otherwise handling the received data.
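
The partitioning and header handling performed by a switch interface and buffer module can be sketched as follows. This is a minimal illustration under stated assumptions: the one-byte header carrying a destination port number, the 1024-byte payload size, and the function names `packetize` and `depacketize` are all hypothetical, since the text does not give the actual packet layout.

```python
def packetize(data: bytes, dest_port: int, payload_size: int = 1024) -> list:
    """Split a data transfer into packets, each prefixed with a routing header."""
    if not 0 <= dest_port <= 255:
        raise ValueError("dest_port must fit in the assumed one-byte header")
    header = bytes([dest_port])  # hypothetical one-byte header naming the target port
    return [
        header + data[offset:offset + payload_size]
        for offset in range(0, len(data), payload_size)
    ]


def depacketize(packets: list) -> bytes:
    """Strip the assumed one-byte header from each packet and rejoin the payloads."""
    return b"".join(packet[1:] for packet in packets)


if __name__ == "__main__":
    video_data = bytes(range(256)) * 10           # 2560 bytes of sample data
    packets = packetize(video_data, dest_port=5)  # three packets for port 5
    assert depacketize(packets) == video_data
```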

Video data from tape library 26 is entered into system memory 30 in a first buffering action. Next, in response to
initial direction from control node 18, the video data is routed through low latency switch 12 to a disk storage node 16
to be made ready for substantially immediate access when needed.

In Fig. 1C, internal details of a disk storage node 16 are shown. Each disk storage node 16 includes a switch interface
and buffer module 40 which enables data to be transferred from/to a RAID buffer video cache and storage interface
module 42. Interface 42 passes received video data onto a plurality of disks 45, spreading the data across the disks in
a quasi-RAID fashion. Details of RAID memory storage are known in the prior art and are described in "A Case for
Redundant Arrays of Inexpensive Disks (RAID)", Patterson et al., ACM SIGMOD Conference, Chicago, IL, June 1-3,
1988, pages 109-116.

A disk storage node 16 further has an internal PC 44 which includes software modules 46 and 48 which, respectively,
provide storage node control, video file and disk control, and RAID mapping for data stored on disks 45. In essence,
each disk storage node 16 provides a more immediate level of video data availability than a tape storage node 17. Each
disk storage node 16 further is enabled to buffer (in a cache manner) video data in a semiconductor memory of switch
interface and buffer module 40 so as to provide even faster availability of video data, upon receiving a request therefor.

In general, a storage node includes a mass storage unit (or an interface to a mass storage unit) and a capability to
locally buffer data read from or to be written to the mass storage unit. The storage node may include sequential access
mass storage in the form of one or more tape drives and/or disk drives, and may include random access storage, such
as one or more disk drives accessed in a random access fashion and/or semiconductor memory.

In Fig. 1D, a block diagram is shown of internal components of a communications node 14. Similar to each of the
above noted nodes, communication node 14 includes a switch interface and buffer module 50 which enables
communications with low latency switch 12 as described previously. Video data is directly transferred between switch interface
and buffer module 50 and a stream buffer and communication interface 52 for transfer to a user terminal (not shown). A
PC 54 includes software modules 56 and 58 which provide, respectively, communication node control (e.g., stream
start/stop actions) and enable the subsequent generation of an isochronous stream of data. An additional input 60 to
stream buffer and communication interface 52 enables frame synchronization of output data. That data is received from
automation control equipment 62 which is, in turn, controlled by a system controller 64 that exerts overall operational
control of the stream server 10 (see Fig. 1). System controller 64 responds to inputs from user control set top boxes 65
to cause commands to be generated that enable media streamer 10 to access a requested video presentation. System
controller 64 is further provided with a user interface and display facility 66 which enables a user to input commands,
such as by hard or soft buttons, and other data to enable an identification of video presentations, the scheduling of video
presentations, and control over the playing of a video presentation.

Each control node 18 is configured as a PC and includes a switch interface module for interfacing with low latency
switch 12. Each control node 18 responds to inputs from system controller 64 to provide information to the communication
nodes 14 and storage nodes 16, 17 to enable desired interconnections to be created via the low latency switch 12.
Furthermore, control node 18 includes software for enabling staging of requested video data from one or more of disk
storage nodes 16 and the delivery of the video data, via a stream delivery interface, to a user display terminal. Control
node 18 further controls the operation of both tape and disk storage nodes 16, 17 via commands sent through low
latency switch 12.

The media streamer has three architected external interfaces, shown in Fig. 1. The external interfaces are:

1) Control Interface: an open system interface executing TCP/IP protocol (Ethernet LAN, TokenRing LAN, serial
port, modem, etc.)

2) Stream Delivery Interface: one of several industry standard interfaces designed for the delivery of data streams
(NTSC, D1, etc.).

3) Automation Control Interface: a collection of industry standard control interfaces for precise synchronization of
stream outputs (GenLock, BlackBurst, SMPTE clock, etc.)

Application commands are issued to media streamer 10 over the control interface. When data load commands are
issued, the control node breaks the incoming data file into segments (i.e. data blocks) and spreads it across one or more
storage nodes. Material density and the number of simultaneous users of the data affect the placement of the data on
storage nodes 16, 17. Increasing density and/or simultaneous users implies the use of more storage nodes for capacity
and bandwidth.
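
A minimal sketch of the load step might look like the following. It assumes fixed-size segments and a simple round-robin placement across storage nodes; the text says only that placement depends on material density and the number of simultaneous users, so both choices, and the function names, are illustrative assumptions.

```python
def split_into_segments(data: bytes, segment_size: int) -> list:
    """Break an incoming data file into fixed-size segments (data blocks)."""
    return [data[i:i + segment_size] for i in range(0, len(data), segment_size)]


def place_segments(num_segments: int, storage_nodes: list) -> dict:
    """Assign segment indices to storage nodes in round-robin order (assumed policy)."""
    placement = {node: [] for node in storage_nodes}
    for index in range(num_segments):
        node = storage_nodes[index % len(storage_nodes)]
        placement[node].append(index)
    return placement


if __name__ == "__main__":
    segments = split_into_segments(b"abcdefghij", segment_size=4)
    # Hypothetical node names; a denser file or more users would call for more nodes.
    print(place_segments(len(segments), ["node16a", "node16b"]))
```

Spreading consecutive segments over more nodes raises both the available capacity and the aggregate bandwidth for a file, which matches the observation that higher density or more simultaneous users implies the use of more storage nodes.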

When commands are issued over the control interface to start the streaming of data to an end user, control node
18 selects and activates an appropriate communication node 14 and passes control information indicating to it the
location of the data file segments on the storage nodes 16, 17. The communications node 14 activates the storage
nodes 16, 17 that need to be involved and proceeds to communicate with these nodes, via command packets sent
through the low latency switch 12, to begin the movement of data.

Data is moved between disk storage nodes 16 and communication nodes 14 via low latency switch 12 and "just in
time" scheduling algorithms. The technique used for scheduling and data flow control is more fully described below. The
data stream that is emitted from a communication node interface 14 is multiplexed to/from disk storage nodes 16 so
that a single communication node stream uses a fraction of the capacity and bandwidth of each disk storage node 16.
In this way, many communication nodes 14 may multiplex access to the same or different data on the disk storage nodes
16. For example, media streamer 10 can provide 1500 individually controlled end user streams from the pool of
communication nodes 14, each of which is multiplexing accesses to a single multimedia file spread across the disk storage
nodes 16. This capability is termed "single copy multiple stream".

The commands that are received over the control interface are executed in two distinct categories. Those which
manage data and do not relate directly to stream control are executed at "low priority". This enables an application to
load new data into the media streamer 10 without interfering with the delivery of data streams to end users. The
commands that affect stream delivery (i.e. output) are executed at "high priority".
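
One way to picture the two categories is a priority queue in which stream control commands always run before pending data management commands. The sketch below is an assumption about how such a dispatcher could be built, not a description of the control node's software; the class `CommandQueue` and its methods are hypothetical, and the command names are those listed for Fig. 2.

```python
import heapq
import itertools

HIGH, LOW = 0, 1  # smaller number runs first


class CommandQueue:
    """Hypothetical dispatcher: high priority first, FIFO within a priority."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker preserving arrival order

    def submit(self, name: str, priority: int) -> None:
        heapq.heappush(self._heap, (priority, next(self._order), name))

    def next_command(self) -> str:
        return heapq.heappop(self._heap)[2]


if __name__ == "__main__":
    queue = CommandQueue()
    queue.submit("VS-CREATE", LOW)
    queue.submit("VS-WRITE", LOW)
    queue.submit("VS-PLAY", HIGH)
    print(queue.next_command())  # the stream control command runs first
```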

The control interface commands are shown in Fig. 2. The low priority data management commands for loading and
managing data in media streamer 10 include VS-CREATE, VS-OPEN, VS-READ, VS-WRITE,
VS-SET_POSITION, VS-CLOSE, VS-RENAME, VS-DELETE, VS-GET_ATTRIBUTES, and VS-GET_NAMES.

The high priority stream control commands for starting and managing stream outputs include VS-CONNECT,
VS-PLAY, VS-RECORD, VS-SEEK, VS-PAUSE, VS-STOP and VS-DISCONNECT. Control node 18 monitors stream
control commands to assure that requests can be executed. This "admission control" facility in control node 18 may
reject requests to start streams when the capabilities of media streamer 10 are exceeded. This may occur in several
circumstances:

1) when some component fails in the system that prevents maximal operation;

2) when a specified number of simultaneous streams to a data file (as specified by parameters of a VS-CREATE
command) is exceeded; and

3) when a specified number of simultaneous streams from the system, as specified by an installation configuration,
is exceeded.
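
The three circumstances above can be expressed as a simple admission check. The sketch below is illustrative only: the `FileLimits` record, the `admit` function, and the modelling of a component failure as a reduction in the number of streams the system can carry (`failed_capacity`) are assumptions, not details given in the text.

```python
from dataclasses import dataclass


@dataclass
class FileLimits:
    max_streams: int         # per-file limit, set when the file is created
    active_streams: int = 0  # streams currently playing this file


def admit(file: FileLimits, system_active: int, system_max: int,
          failed_capacity: int = 0) -> bool:
    """Return True if one more stream to `file` may be started."""
    # 1) A failed component is assumed to lower the usable system capacity.
    effective_max = system_max - failed_capacity
    # 2) Per-file limit on simultaneous streams.
    if file.active_streams >= file.max_streams:
        return False
    # 3) System-wide limit from the installation configuration.
    if system_active >= effective_max:
        return False
    return True


if __name__ == "__main__":
    movie = FileLimits(max_streams=2, active_streams=1)
    print(admit(movie, system_active=9, system_max=10))
    print(admit(movie, system_active=9, system_max=10, failed_capacity=1))
```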


The communication nodes 14 are managed as a heterogeneous group, each with a potentially different bandwidth
(stream) capability and physical definition. The VS-CONNECT command directs media streamer 10 to allocate a
communication node 14 and some or all of its associated bandwidth enabling isochronous data stream delivery. For example,
media streamer 10 can play uncompressed data stream(s) through communication node(s) 14 at 270 Mbits/sec while
simultaneously playing compressed data stream(s) at much lower data rates (usually 1-16 Mbits/sec) on other
communication nodes 14.

Storage nodes 16, 17 are managed as a heterogeneous group, each with a potentially different bandwidth (stream)
capability and physical definition. The VS-CREATE command directs media streamer 10 to allocate storage in one or
more storage nodes 16, 17 for a multimedia file and its associated metadata. The VS-CREATE command specifies both
the stream density and the maximum number of simultaneous users required.

Three additional commands support automation control systems in the broadcast industry: VS-CONNECT-LIST,
VS-PLAY-AT-SIGNAL and VS-RECORD-AT-SIGNAL. VS-CONNECT-LIST allows applications to specify a sequence of 
play commands in a single command to the subsystem. Media streamer 10 will execute each play command as if it were 
issued over the control interface but will transition between the delivery of one stream and the next seamlessly. An 
example sequence follows: 

1) Control node 18 receives a VS-CONNECT-LIST command with play subcommands indicating that all or part of
FILE1, FILE2 and FILE3 are to be played in sequence. Control node 18 determines the maximum data rate of the
files and allocates that resource on a communication node 14. The allocated communication node 14 is given the
detailed play list and initiates the delivery of the isochronous stream.

2) Near the end of the delivery of FILE1, the communication node 14 initiates the delivery of FILE2 but it does not
enable it to the output port of the node. When FILE1 completes or a signal from the Automation Control Interface
occurs, the communication node 14 switches the output port to the second stream from the first. This is done within
1/30th of a second or within one standard video frame time.

3) The communication node 14 deallocates resources associated with FILE1.

VS-PLAY-AT-SIGNAL and VS-RECORD-AT-SIGNAL allow signals from the external Automation Control Interface
to enable data transfer for play and record operations with accuracy to a video frame boundary. In the previous example,
the VS-CONNECT-LIST includes a PLAY-AT-SIGNAL subcommand to enable the transition from FILE1 to FILE2 based
on the external automation control interface signal. If the subcommand were VS-PLAY instead, the transition would
occur only when the FILE1 transfer was completed.

Other commands that media streamer 10 executes provide the ability to manage storage hierarchies. These
commands are: VS-DUMP, VS-RESTORE, VS-SEND, VS-RECEIVE and VS-RECEIVE_AND_PLAY. Each causes one or
more multimedia files to move between storage nodes 16 and two externally defined hierarchical entities.

1) VS-DUMP and VS-RESTORE enable movement of data between disk storage nodes 16 and a tape storage unit
17 accessible to control node 18. Data movement may be initiated by the controlling application or automatically by
control node 18.

2) VS-SEND and VS-RECEIVE provide a method for transmitting a multimedia file to another media streamer.
Optionally, the receiving media streamer can play the incoming file immediately to a preallocated communication
node without waiting for the entire file.

In addition to the modular design and function set defined in the media streamer architecture, data flow is optimized
for isochronous data transfer to significantly reduce cost. In particular:

1) bandwidth of the low latency switch exceeds that of the attached nodes; communications between nodes is nearly
non-blocking;

2) data movement into processor memory is avoided; more bandwidth is provided;

3) processing of data is avoided; expensive processing units are eliminated; and

4) data movement is carefully scheduled so that large data caches are avoided.

In traditional computer terms, media streamer 10 functions as a system of interconnected adapters with an ability
to perform peer-peer data movement between themselves through the low latency switch 12. The low latency switch
12 has access to data storage and moves data segments from one adapter's memory to that of another without "host
computer" intervention.

B. HIERARCHICAL MANAGEMENT OF DIGITAL COMPRESSED VIDEO DATA FOR ISOCHRONOUS DELIVERY

Media streamer 10 provides hierarchical storage elements. It exhibits a design that allows scalability from a very
small video system to a very large system. It also provides flexibility for storage management to adapt to the varied
requirements necessary to satisfy functions of Video on Demand, Near Video on Demand, Commercial Insertion, high
quality uncompressed video storage, capture and playback.

B1. TAPE STORAGE 


In media streamer 10, video presentations are moved from high performance digital tape to disk, to be played out
at the much lower data rate required by the end user. In this way, only a minimum amount of video time is stored on the
disk subsystem. If the system is "Near Video on Demand", then only, by example, 5 minutes of each movie need be in
disk storage at any one time. This requires only 22 segments of 5 minutes each for a typical 2 hour movie. The result
is that the total disk storage requirement for a video presentation is reduced, since not all of the video presentation is
kept on the disk file at any one time. Only that portion of the presentation that is being played need be present in the
disk file.

In other words, if a video presentation requires a time T to present in its entirety, and is stored as a digital
representation having N data blocks, then each data block stores a portion of the video presentation that corresponds to
approximately a T/N period of the video presentation. A last data block of the N data blocks may store less than a T/N
period.

As demand on the system grows and the number of streams increases, the statistical average is that about 25%
of video stream requests will be for the same movie, but at different sub-second time intervals, and the distribution of
viewers will be such that more than 50% of those sub-second demands will fall within a group of 15 movie segments.

An aspect of this invention is the utilization of the most appropriate technology that will satisfy this demand. A random
access cartridge loader (such as produced by the IBM Corporation) is a digital tape system that has high storage capacity
per tape, mechanical robotic loading of 100 tapes per drawer, and up to 2 tape drives per drawer. The result is an effective
tape library for movie-on-demand systems. However, the invention also enables very low cost digital tape storage library
systems to provide the mass storage of the movies, and further enables low demand movies to be played directly from
tape to speed-matching buffers and then on to video decompression and distribution channels.

A second advantage of combining hierarchical tape storage with any video system is that it provides rapid backup to
any movie that is stored on disk, in the event that a disk becomes inoperative. A typical system will maintain a "spare"
disk such that if one disk unit fails, then movies can be reloaded from tape. This would typically be combined with a
RAID or a RAID-like system.



B2. DISK STORAGE SYSTEMS



When demand for video streams increases to a higher level, it becomes more efficient to store an entire movie on
disk and save the system performance overhead required to continually move video data from tape to disk. A typical
system will still contain a library of movies that are stored on tape, since the usual number of movies in the library is 10x
to 100x greater than the number that will be playing at any one time. When a user requests a specific movie, segments
of it are loaded to a disk storage node 16 and started from there.

When there are large numbers of users wanting to see the same movie, it is beneficial to keep the movie on disk.
These movies are typically the "Hot" movies of the current week and are pre-loaded from tape to disk prior to peak
viewing hours. This tends to reduce the workload on the system during peak hours.

B3. MOVIES OUT OF CACHE

As demand for "hot" movies grows, media streamer 10, through an MRU-based algorithm, decides to move key
movies up into cache. This requires substantial cache memory, but in terms of the ratio of cost to the number of active
streams, the high volume that can be supported out of cache lowers the total cost of the media streamer 10.

Because of the nature of video data, and the fact that the system always knows in advance what videos are playing
and what data will be required next, and for how long, methods are employed to optimize the use of cache, internal
buffers, disk storage, the tape loader, bus performance, etc.

Algorithms that control the placement and distribution of the content across all of the storage media enable delivery
of isochronous data to a wide spectrum of bandwidth requirements. Because the delivery of isochronous data is
substantially 100% predictable, the algorithms are very much different from the traditional ones used for other segments of
the computer industry where caching of user-accessed data is not always predictable.

C. MEDIA STREAMER DATA FLOW ARCHITECTURE

As indicated above, media streamer 10 delivers video streams to various outputs such as TV sets and set top boxes
attached via a network, such as a LAN, ATM, etc. To meet the requirements for storage capacity and the number of






EP 0 702 491 A1 



simultaneous streams, a distributed architecture consisting of multiple storage and communication nodes is preferred.
The data is stored on storage nodes 16, 17 and is delivered by communication nodes. A communication node 14 obtains
the data from appropriate storage nodes 16, 17. The control node 18 provides a single system image to the external
world. The nodes are connected by the cross-connect, low latency switch 12.

Data rates and the data to be delivered are predictable for each stream. The invention makes use of this predictability
to construct a data flow architecture that makes full use of resources and which ensures that the data for each stream
is available at every stage when it is needed.

Data flow between the storage nodes 16, 17 and the communication nodes 14 can be set up in a number of different
ways.

A communication node 14 is generally responsible for delivering multiple streams. It may have requests outstanding
for data for each of these streams, and the required data may come from different storage nodes 16, 17. If different
storage nodes were to attempt, simultaneously, to send data to the same communication node, only one storage node
would be able to send the data, and the other storage nodes would be blocked. The blockage would cause these storage
nodes to retry sending the data, degrading switch utilization and introducing a large variance in the time required to
send data from a storage node to the communication node. In this invention, there is no contention for an input port of
a communication node 14 among different storage nodes 16, 17.

The amount of required buffering can be determined as follows: the communication node 14 determines the mean
time required to send a request to the storage node 16, 17 and receive the data. This time is determined by adding the
time to send a request to the storage node and the time to receive the response, to the time needed by the storage node
to process the request. The storage node in turn determines the mean time required to process the request by adding
the mean time required to read the data from disk and any delays involved in processing the request. This is the latency
in processing the request. The amount of buffering required is the memory storage needed at the stream data rate to
cover the latency. The solution described below takes advantage of special conditions in the media streamer environment
to reduce latency and hence to reduce the resources required. The latency is reduced by using a just-in-time scheduling
algorithm at every stage of the data (e.g., within storage nodes and communications nodes), in conjunction with
anticipating requests for data from the previous stage.
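
As a minimal illustration of the buffering calculation described above, the following sketch multiplies the stream data rate by the summed latency. The function and parameter names are hypothetical, and the transfer times used in the usage note are assumed values, not figures from this description.

```python
def required_buffer_bytes(stream_rate_bytes_per_s: float,
                          request_send_s: float,
                          storage_processing_s: float,
                          response_receive_s: float) -> float:
    """Bytes of buffering needed to cover the request latency at the stream rate."""
    # Latency = time to send the request + time the storage node needs to
    # process it (disk read plus delays) + time to receive the response.
    latency_s = request_send_s + storage_processing_s + response_receive_s
    return stream_rate_bytes_per_s * latency_s
```

For a 250,000 bytes/sec stream with an assumed 0.05 sec to send the request, 0.5 sec of storage node processing and 0.05 sec to receive the response, `required_buffer_bytes(250_000, 0.05, 0.5, 0.05)` gives about 150,000 bytes. Reducing the storage node term through just-in-time scheduling reduces the buffer proportionally.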

Contention by the storage nodes 16, 17 for the input port of a communication node 14 is eliminated by employing
the following two criteria:

1) A storage node 16, 17 only sends data to a communication node 14 on receipt of a specific request.

2) A given communication node 14 serializes all requests for data to be read from storage nodes so that only one
request for receiving data from the communication node 14 is outstanding at any time, independent of the number
of streams the communication node 14 is delivering.

As was noted above, the reduction of latency relies on a just-in-time scheduling algorithm at every stage. The basic
principle is that at every stage in the data flow for a stream, the data is available when the request for that data arrives.
This reduces latency to the time needed for sending the request and performing any data transfer. Thus, when the
control node 18 sends a request to the storage node 16 for data for a specific stream, the storage node 16 can respond
to the request almost immediately. This characteristic is important to the solution to the contention problem described
above.

Since, in the media streamer environment, access to data is sequential and the data rate for a stream is predictable,
a storage node 16 can anticipate when a next request for data for a specific stream can be expected. The identity of
the data to be supplied in response to the request is also known. The storage node 16 also knows where the data is
stored and the expected requests for the other streams. Given this information and the expected time to process a read
request from a disk, the storage node 16 schedules a read operation so that the data is available just before the request
from the communication node 14 arrives. For example, if the stream data rate is 250KB/sec, and a storage node 16
contains every 4th segment of a video, requests for data for that stream will arrive every 4 seconds. If the time to process
a read request is 500 msec (with the requisite degree of confidence that the read request will complete in 500 msec)
then the request is scheduled for at least 500 msec before the anticipated receipt of request from the communication
node 14.
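
The timing in this example can be sketched as follows. The function names are hypothetical, and the segment size of 250,000 bytes is an assumption chosen so that the four-second request interval of the example is reproduced.

```python
def request_interval_s(segment_bytes, n_stripes, stream_rate_bytes_per_s):
    """Seconds between requests to one storage node holding every n-th segment."""
    return n_stripes * segment_bytes / stream_rate_bytes_per_s


def disk_read_start_times(first_request_s, interval_s, read_time_s, count):
    """Latest start time of each disk read so that it completes before its request."""
    return [first_request_s + i * interval_s - read_time_s for i in range(count)]
```

With `request_interval_s(250_000, 4, 250_000)` equal to 4.0 seconds and a 0.5 second read time, a node whose first request is expected at 4.0 seconds must start its reads no later than 3.5, 7.5 and 11.5 seconds.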



C1. CONTROL NODE 18 FUNCTIONS



The control node 18 function is to provide an interface between media streamer 10 and the external world for control
flow. It also presents a single system image to the external world even if the media streamer 10 is itself implemented
as a distributed system. The control node functions are implemented by a defined Application Program Interface (API).
The API provides functions for creating the video content in media streamer 10 as well as for real-time functions such
as playing/recording of video data. The control node 18 forwards real-time requests to play or stop the video to the
communication nodes 14.

C2. COMMUNICATION NODE 14 

A communication node 14 has the following threads (in the same process) dedicated to handle a real time video 
interface: a thread to handle connect/disconnect requests, a thread to handle play/stop and pause/resume requests,
and a thread to handle a jump request (seek forward or seek backward). In addition it has an input thread that reads 
data for a stream from the storage nodes 16 and an output thread that writes data to the output ports. 

A data flow structure in a communication node 14 for handling data during the playing of a video is depicted in Fig.
3. The data flow structure includes an input thread 100 that obtains data from a storage node 16. The input thread 100
serializes receipt of data from storage nodes so that only one storage node is sending data at any one time. The input
thread 100 ensures that when an output thread 102 needs to write out of a buffer for a stream, the buffer is already filled
with data. In addition, there is a scheduler function 104 that schedules both the input and output operations for the
streams. This function is used by both the input and output threads 100 and 102.

Each thread works off a queue of requests. The request queue 106 for the output thread 102 contains requests that
identify the stream and that point to an associated buffer that needs to be emptied. These requests are arranged in
order by a time at which they need to be written to the video output interface. When the output thread 102 empties a
buffer, it marks it as empty and invokes the scheduler function 104 to queue the request in an input queue 108 for the
stream to the input thread (for the buffer to be filled). The queue 108 for the input thread 100 is also arranged in order
by a time at which buffers need to be filled.

Input thread 100 also works off the request queue 108 arranged by request time. Its task is to fill the buffer from a
storage node 16. For each request in its queue, the input thread 100 takes the following actions. The input thread 100
determines the storage node 16 that has the next segment of data for the stream (the data for a video stream is preferably
striped across a number of storage nodes). The input thread 100 then sends a request to the determined storage node
(using messages through switch 12) requesting data for the stream, and then waits for the data to arrive.

This protocol ensures that only one storage node 16 will be sending data to a particular communications node 14
at any time, i.e., it removes the conflict that may arise if the storage nodes were to send data asynchronously to a
communications node 14. When the requested data is received from the storage node 16, the input thread 100 marks
the buffer as full and invokes the scheduler 104 to buffer a request (based on the stream's data rate) to the output thread
102 to empty the buffer.
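
The interplay of the two time-ordered queues can be sketched as below. This is a single-threaded simplification with hypothetical names: the real node uses separate input and output threads, and here one `needed_at_s` value stands both for the time by which a buffer must be refilled and for the time at which it is next written.

```python
import heapq


class CommNodeScheduler:
    """Hypothetical stand-in for scheduler function 104 and queues 106 and 108."""

    def __init__(self):
        self.input_queue = []    # queue 108: (needed_at_s, stream_id, buffer_id)
        self.output_queue = []   # queue 106: (needed_at_s, stream_id, buffer_id)

    def buffer_emptied(self, needed_at_s, stream_id, buffer_id):
        """Output thread: queue an empty buffer to be refilled before needed_at_s."""
        heapq.heappush(self.input_queue, (needed_at_s, stream_id, buffer_id))

    def buffer_filled(self, needed_at_s, stream_id, buffer_id):
        """Input thread: queue a full buffer to be written at needed_at_s."""
        heapq.heappush(self.output_queue, (needed_at_s, stream_id, buffer_id))


def input_thread_step(sched, fetch_segment):
    """Serve the most urgent fill request; return False if the queue is empty."""
    if not sched.input_queue:
        return False
    needed_at_s, stream_id, buffer_id = heapq.heappop(sched.input_queue)
    fetch_segment(stream_id)     # blocking call: one storage request at a time
    sched.buffer_filled(needed_at_s, stream_id, buffer_id)
    return True
```

Because `fetch_segment` blocks until the data has arrived, at most one storage node is ever sending to the node, regardless of how many streams have queued requests.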

C3. STORAGE NODE 16

The structure of the storage node 16 for data flow to support the playing of a stream is depicted in Fig. 4. The storage
node 16 has a pool of buffers that contain video data. It has an input thread 110 for each of the logical disk drives and
an output thread 112 that writes data out to the communications nodes 14 via the switch matrix 12. It also has a scheduler
function 114 that is used by the input and output threads 110, 112 to schedule operations. It also has a message thread
116 that processes requests from communications nodes 14 requesting data.

When a message is received from a communications node 14 requesting data, the message thread 116 will normally
find the requested data already buffered, and queues the request (queue 118) to the output thread. The requests are
queued in time order. The output thread 112 will empty the buffer and add it to the list of free buffers. Each of the input
threads 110 has its own request queue. For each of the active streams that have video data on the associated disk
drive, a queue 120 ordered by request time (based on the data rate, level of striping, etc.) to fill the next buffer is
maintained. The thread takes the first request in queue 120, associates a free buffer with it and issues an I/O request to fill
the buffer with the data from the disk drive. When the buffer is filled, it is added to the list of full buffers. This is the list
that is checked by the message thread 116 when the request for data for the stream is received. When a message for
data is received from a communication node 14 and the required buffer is not full, it is considered to be a missed deadline.
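
The buffer bookkeeping described above can be sketched as follows. The class and method names are hypothetical, disk I/O is reduced to a single completion call, and threads are omitted.

```python
from collections import deque


class StorageNodeBuffers:
    """Hypothetical sketch of the free/full buffer lists of a storage node 16."""

    def __init__(self, free_buffers):
        self.free = deque(range(free_buffers))  # list of free buffer ids
        self.full = {}                          # (stream_id, segment) -> buffer id
        self.output_queue = deque()             # stands in for queue 118
        self.missed_deadlines = 0

    def disk_read_complete(self, stream_id, segment):
        """Input thread: a free buffer has been filled from disk (raises if none is free)."""
        self.full[(stream_id, segment)] = self.free.popleft()

    def on_request(self, stream_id, segment):
        """Message thread: queue buffered data for output, or record a missed deadline."""
        buf = self.full.pop((stream_id, segment), None)
        if buf is None:
            self.missed_deadlines += 1
            return False
        self.output_queue.append((stream_id, segment, buf))
        return True

    def output_one(self):
        """Output thread: send the oldest queued buffer and return it to the free list."""
        stream_id, segment, buf = self.output_queue.popleft()
        self.free.append(buf)
        return stream_id, segment
```

A request that finds no full buffer only increments the counter here; how the node recovers from a missed deadline is not covered by this sketch.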

C4. JUST-IN-TIME SCHEDULING

A just-in-time scheduling technique is used in both the communications nodes 14 and the storage nodes 16. The
technique employs the following parameters:

bc = buffer size at the communications node 14;

bs = buffer size at the storage node 16;

r = video stream data rate;

n = number of stripes of video containing the data for the video stream;

sr = stripe data rate; and

sr = r/n.



The algorithm used is as follows:

(1) sfc = frequency of requests at the communications node for a stream = r/bc; and

(2) dfc = frequency of disk read requests at the storage node = sr/bs.

The "striping" of video data is described in detail below in section H.

The requests are scheduled at a frequency determined by the expressions given above, and are scheduled so that
they complete in advance of when the data is needed. This is accomplished by "priming" the data pipe with data at the
start of playing a video stream.

Calculations of sfc and dfc are made at connect time, in both the communication node 14 playing the stream and
the storage nodes 16 containing the video data. The frequency (or its inverse, the interval) is used in scheduling input
from disk in the storage node 16 (see Fig. 4) and in scheduling the output to the port (and input from the storage nodes)
in the communication node 14 (see Fig. 3).
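
The connect-time calculation can be written out directly from expressions (1) and (2). The function name is hypothetical; the parameter names follow the definitions given above.

```python
def jit_frequencies(r, n, bc, bs):
    """Return (sr, sfc, dfc) from the stream rate r, stripe count n and buffer sizes.

    sr  = stripe data rate in bytes/sec (r/n)
    sfc = requests per second at the communication node (r/bc)
    dfc = disk read requests per second at one storage node (sr/bs)
    """
    sr = r / n
    sfc = r / bc
    dfc = sr / bs
    return sr, sfc, dfc
```

For r = 250,000 bytes/sec, n = 4 and both buffer sizes equal to 250,000 bytes, this yields sr = 62,500 bytes/sec, sfc = 1/sec and dfc = 0.25/sec at each storage node, i.e. one disk read every four seconds per node, or one per second summed over the four nodes.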

Example of Just-In-Time Scheduling: 

Play a stream at 2.0 Mbits/sec (250,000 bytes/sec.) from a video striped on four storage nodes. Also assume that
the buffer size at the communication node is 250,000 bytes and the buffer size at the disk node is 250,000 bytes. Also,
assume that the data is striped in segments of 250,000 bytes.

The values for the various parameters in the Just-In-Time algorithm are as follows:

bc = 250,000 bytes (buffer size at the communication node 14);

bs = 250,000 bytes (buffer size at the storage node 16);

r = 250,000 bytes/sec. (stream data rate);

n = 4 (number of stripes that video for the stream is striped over);

sr = r/n = 62,500 bytes/sec. or 250,000/4 bytes/sec., i.e. 250,000 bytes every four seconds;

sfc = r/bc = 1/sec. (frequency of requests at the communication node 14); and

dfc = r/bs = 1/sec. (frequency of requests at the storage node 16). 

The communication node 14 responsible for playing the stream will schedule input and output requests at the
frequency of 1/sec. or at intervals of 1.0 seconds. Assuming that the communication node 14 has two buffers dedicated
for the stream, the communication node 14 ensures that it has both buffers filled before it starts outputting the video
stream.

At connect time the communication node 14 will have sent messages to all four storage nodes 16 containing a stripe
of the video data. The first two of the storage nodes will anticipate the requests for the first segment from the stripes
and will schedule disk requests to fill the buffers. The communication node 14 will schedule input requests (see Fig. 3)
to read the first two segments into two buffers, each of size 250,000 bytes. When a play request comes, the
communication node 14 will first ensure that the two buffers are full, and then inform all storage nodes 16 that play is about to
commence. It then starts playing the stream. When the first buffer has been output (which at 2 Mbits/sec. (or 250,000
bytes/sec.) will take one second), the communication node 14 requests data from a storage node 16. The communication
node 14 then requests data from each of the storage nodes, in sequence, at intervals of one second, i.e. it will request
data from a specific storage node at intervals of four seconds. It always requests 250,000 bytes of data at a time. The
calculation of the frequency at which a communication node requests data from the storage nodes 16 is done by the
communication node 14 at connect time.

The storage nodes 16 anticipate the requests for the stream data as follows. The storage node 16 containing stripe
3 (see section H below) can expect a request for the next 250,000 byte segment one second after the play has
commenced, and every four seconds thereafter. The storage node 16 containing stripe 4 can expect a request two seconds
after the play has commenced and every four seconds thereafter. The storage node 16 containing stripe 2 can expect
a request four seconds after play has commenced and every four seconds thereafter. That is, each storage node 16 schedules
the input from disk at a frequency of 250,000 bytes every four seconds from some starting time (as described above).
The scheduling is accomplished in the storage node 16 after receipt of the play command and after a buffer for the
stream has been output. The calculation of the request frequency is done at the time the connect request is received.
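
The request times listed above can be reproduced with a small sketch. It assumes that segments are assigned to stripes 1 to n in round-robin order and that the first two segments are primed before play starts; the function name and defaults are hypothetical.

```python
def request_times(stripe, n_stripes=4, primed_segments=2, interval_s=1.0, count=3):
    """Times (seconds after play starts) of the first `count` requests to a stripe."""
    times = []
    k = stripe - 1                      # index of the first segment on this stripe
    while len(times) < count:
        if k >= primed_segments:        # primed segments were fetched before play
            times.append((k - primed_segments + 1) * interval_s)
        k += n_stripes                  # the stripe holds every n-th segment
    return times
```

Under these assumptions `request_times(3)` returns `[1.0, 5.0, 9.0]`, `request_times(4)` returns `[2.0, 6.0, 10.0]` and `request_times(2)` returns `[4.0, 8.0, 12.0]`, matching the one, two and four second offsets in the text.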

It is also possible to use different buffer sizes at the communication node 14 and the storage node 16. For example,
the buffer size at the communication node 14 may be 50,000 bytes and the buffer size at the storage node 16 may be
250,000 bytes. In this case, the frequency of requests at the communication node 14 will be (250,000/50,000) 5/sec. or
every 0.2 seconds, while the frequency at the storage node 16 will remain at 1/sec. The communication node 14 reads
the first two buffers (100,000 bytes) from the storage node containing the first stripe (note that the segment size is
250,000 bytes and the storage node 16 containing the first segment will schedule the input from disk at connect time).
When play commences, the communication node 14 informs the storage nodes 16 of same and outputs the first buffer.
When the buffer empties, the communication node 14 schedules the next input. The buffers will empty every 0.2 seconds
and the communication node 14 requests input from the storage nodes 16 at that frequency, and also schedules output
at the same frequency.

In this example, storage nodes 16 can anticipate five requests to arrive at intervals of 0.2 seconds (except for the
first segment, where 100,000 bytes have already been read, so initially three requests will come after commencement of
play) every four seconds, i.e., the next sequence of five requests (each for 50,000 bytes) will arrive four seconds after
the last request of the previous sequence. Since the buffer size at the storage node is 250,000 bytes, the storage
nodes 16 will schedule the input from disk every four seconds (just as in the example above).

C5. DETAILS OF A PLAY ACTION



The following steps trace the control and data flow for the playing action of a stream. The steps are depicted in
Figure 5 for setting up a video for play. The steps are in time order.

1. The user invokes a command to set up a port with a specific video that has been previously loaded. The request
is sent to the control node 18.

2. A thread in the control node 18 receives the request and a VS-CONNECT function.

3. The control node thread opens a catalog entry for the video, and sets up a memory descriptor for the video with
the striped file information.

4. The control node 18 allocates a communication node 14 and an output port on that node for the request.

5. Then control node 18 sends a message to the allocated communication node 14.

6. A thread in the communication node 14 receives the message from the control node 18.

7. The communication node thread sends an open request to the storage node 16 containing the stripe files.

8,9. A thread in each storage node 16 that the open request is sent to receives the request, opens the requested
stripe file and allocates any needed resources, as well as scheduling input from disk (if the stripe file contains the
first few segments).

10. The storage node thread sends a response back to the communication node 14 with the handle (identifier) for
the stripe file.

11. The thread in the communication node 14 waits on responses from all of the storage nodes involved and on
receiving successful responses allocates resources for the stream, including setting up the output port.

12. The communication node 14 then schedules input to prime the video data pipeline.

13. The communication node 14 then sends a response back to the control node 18.

14. The control node thread on receipt of a successful response from the communication node 14 returns a handle
for the stream to the user to be used in subsequent requests related to this instance of the stream.
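
The set-up sequence can be traced with a compact, single-process sketch. It is only an illustration: the function names, the dictionary-based catalog and node records, and the use of a simple counter for handles are all assumptions, and messages through the switch are replaced by direct function calls.

```python
import itertools

_handles = itertools.count(1)   # hypothetical source of handle values


def open_stripe(storage_node, stripe_file):
    """Steps 8-10: a storage node opens a stripe file and returns a handle."""
    if stripe_file not in storage_node["files"]:
        return None
    return next(_handles)


def comm_node_connect(storage_nodes, stripe_files):
    """Steps 6-13: open every stripe file, then allocate stream resources."""
    handles = [open_stripe(node, f) for node, f in zip(storage_nodes, stripe_files)]
    if any(h is None for h in handles):
        return None              # a storage node could not open its stripe file
    return {"stripe_handles": handles, "primed": True}


def control_node_connect(catalog, video, storage_nodes):
    """Steps 2-5 and 14: look up the video, delegate, and return a stream handle."""
    stripe_files = catalog.get(video)
    if stripe_files is None:
        return None              # no catalog entry for the requested video
    stream = comm_node_connect(storage_nodes, stripe_files)
    return None if stream is None else next(_handles)
```

The handle is returned to the user only after every involved storage node has responded successfully, which mirrors steps 11 to 14.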

The following are the steps in time order for the actions that are taken on receipt of the play request after a video
stream has been successfully set up. The steps are depicted in Fig. 6.

1. The user invokes the play command.

2. A thread in the control node 18 receives the request.

3. The thread in the control node 18 verifies that the request is for a stream that is set up, and then sends a play
request to the allocated communication node 14.

4. A thread in the communication node 14 receives the play request.

5. The communication node 14 sends the play request to all of the involved storage nodes 16 so that they can
schedule their own operations in anticipation of subsequent requests for this stream. An "involved" storage node is
one that stores at least one stripe of the video presentation of interest.

6. A thread in each involved storage node 16 receives the request and sets up schedules for servicing future requests
for the stream. Each involved storage node 16 sends a response back to the communication node 14.

7. The communication node thread ensures that the pipeline is primed (preloaded with video data) and enables the
stream for output.

8. The communication node 14 then sends a response back to the control node 18.

9. The control node 18 sends a response back to the user that the stream is playing.

The input and output threads continue to deliver the video presentation to the specified port until a stop/pause
command is received or the video completes.

Media streamer 10 is a passive server, which performs video server operations when it receives control commands
from an external control system. Figure 7 shows a system configuration for media streamer 10 applications and illustrates
the interfaces present in the system.

Media streamer 10 provides two levels of interfaces for users and application programs to control its operations:

a user interface ((A) in Fig. 7); and
an application program interface ((B) in Fig. 7).

Both levels of interface are provided on client control systems, which communicate with the media streamer 10
through a remote procedure call (RPC) mechanism. By providing the interfaces on the client control systems, instead
of on the media streamer 10, the separation of application software from media streamer 10 is achieved. This facilitates
upgrading or replacing the media streamer 10, since it does not require changing or replacing the application software
on the client control system.

D1. USER COMMUNICATIONS 

Media streamer 10 provides two types of user interfaces:
a command line interface; and 
a graphical user interface. 

D1.1. COMMAND LINE INTERFACE

The command line interface displays a prompt on the user console or interface (65, 66 of Fig. 1). After the command
prompt, the user enters a command, starting with a command keyword followed by parameters. After the command is
executed, the interface displays a prompt again and waits for the next command input. The media streamer command
line interface is especially suitable for the following two types of operations:

Batch Control: 

Batch control involves starting execution of a command script that contains a series of video control commands.
For example, in the broadcast industry, a command script can be prepared in advance to include pre-recorded, scheduled
programs for an extended period of time. At the scheduled start time, the command script is executed by a single batch
command to start broadcasting without further operator intervention.

Automatic Control: 

Automatic control involves executing a list of commands generated by a program to update/play materials stored
on media streamer 10. For example, a news agency may load new materials into the media streamer 10 every day. An
application control program that manages the new materials can generate media streamer commands (for example,
Load, Delete, Unload) to update the media streamer 10 with the new materials. The generated commands may be piped
to the command line interface for execution.
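
By way of illustration only, automatic control can be sketched as a small program that emits one command per line. The keywords Load and Delete are taken from the text above; the argument layout, the function name and the video names are assumptions, since the actual command syntax is not given here.

```python
from typing import Iterable, List


def generate_update_commands(stale: Iterable[str], fresh: Iterable[str]) -> List[str]:
    """Build command lines that replace stale videos with fresh ones.

    The "Delete <name>" / "Load <name>" layout is hypothetical.
    """
    commands = [f"Delete {name}" for name in stale]
    commands += [f"Load {name}" for name in fresh]
    return commands


if __name__ == "__main__":
    # One command per line, so the output can be piped to a command line interface.
    for line in generate_update_commands(["news_monday"], ["news_tuesday"]):
        print(line)
```

Writing the commands to standard output keeps the generating program independent of how the command line interface is started.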

D1.2. GRAPHICAL USER INTERFACE

Fig. 8 is an example of the media streamer graphical user interface. The interface resembles the control panel of
a video cassette recorder, which has control buttons such as Play, Pause, Rewind, and Stop. In addition, it also provides
selection panels when an operation involves a selection by the user (for example, load requires the user to select a
video presentation to be loaded). The graphical user interface is especially useful for direct user interactions.
A "Batch" button 130 and an "Import/Export" button 132 are included in the graphical user interface. Their functions
are described below.

D2. USER FUNCTIONS 

Media streamer 10 provides three general types of user functions:

Import/Export;
VCR-like play controls; and
Advanced user controls. 

D2.1. IMPORT/EXPORT

Import/Export functions are used to move video data into and out of the media streamer 10. When a video is moved
into media streamer 10 (Import) from the client control system, the source of the video data is specified as a file or a
device of the client control system. The target of the video data is specified with a unique name within media streamer
10. When a video is moved out of media streamer 10 (Export) to the client control system, the source of the video data
is specified by its name within media streamer 10, and the target of the video data is specified as a file or a device of
the client control system.

In the Import/Export category of user functions, media streamer 10 also provides a "delete" function to remove a
video and a "get attributes" function to obtain information about stored videos (such as name, data rate).
To invoke Import/Export functions through the graphical user interface, the user clicks on the "Import/Export" soft
button 132 (Fig. 8). This brings up a new panel (not shown) that contains "Import", "Export", "Delete", "Get Attribute"
buttons to invoke the individual functions.

D2.2. VCR-LIKE PLAY CONTROLS 

Media streamer 10 provides a set of VCR-like play controls. The media streamer graphical user interface in Fig. 8
shows that the following functions are available: Load, Eject, Play, Slow, Pause, Stop, Rewind, Fast Forward and Mute.
These functions are activated by clicking on the corresponding soft buttons on the graphical user interface. The media
streamer command line interface provides a similar set of functions:

Setup - sets up a video for a specific output port. Analogous to loading a video cassette into a VCR.

Play - initiates playing a video that has been set up or resumes playing a video that has been paused.

Pause - pauses playing a video.

Detach - analogous to ejecting a video cassette from a VCR.

Status - displays the status of ports, such as which video is playing, elapsed playing time, etc.

D2.3. ADVANCED USER CONTROLS 

In order to support specific application requirements, such as the broadcasting industry, the present invention
provides several advanced user controls:

Play list - set up multiple videos and their sequence to be played on a port.
Play length - limit the time a video will be played.
Batch operation - perform a list of operations stored in a command file.
The Play list and Play length controls are accomplished with a "Load" button 134 on the graphical user interface.
Each "setup" command will specify a video to be added to the play list for a specific port. It also specifies a time limit
that the video will be played. Fig. 9 shows the panel which appears in response to clicking on the "Load" soft button 134
on the graphical user interface to select a video to be added to the play list and to specify the time limit for playing the
video. When the user clicks on a file name in the "Files" box 136, the name is entered into "File Name" box 138. When
the user clicks on the "Add" button 140, the file name in "File Name" box 138 is appended to the "Play List" box 142
with its time limit, and the panel displays the current play list (with the time limit of each video on the play list).

The batch operation is accomplished by using a "Batch" soft button 130 on the graphical user interface (see Fig. 8).
When the "Batch" button 130 is activated, a batch selection panel is displayed for the user to select or enter the
command file name (see Fig. 10). Pressing an "Execute" button 144 on the batch selection panel starts the execution
of the commands in the selected command file. Fig. 10 is an example of the "Batch" and "Execute" operation on the
graphical user interface. For example, the user has first created a command script in a file "batch2" in the c:/batchcmd
directory. The user then clicks on "Batch" button 130 on the graphical user interface shown in Fig. 8 to bring up the Batch
Selection panel. Next, the user clicks on "c:/batchcmd" in "Directory" box 146 of the Batch Selection panel. This results
in the display of a list of files in "Files" box 148. Clicking on the "batch2" line in "Files" box 148 enters it into the "File
Name" box 150. Finally, the user clicks on the "Execute" button 144 to execute in sequence the commands stored in
the "batch2" file.

D3. APPLICATION PROGRAM INTERFACE 

Media streamer 10 provides the above-mentioned Application Program Interface (API) so that application control
programs can interact with media streamer 10 and control its operations (reference may be made again to Fig. 7).

The API consists of remote procedure call (RPC)-based procedures. Application control programs invoke the API
functions by making procedure calls. The parameters of the procedure call specify the functions to be performed. The
application control programs invoke the API functions without regard to the logical and physical location of media
streamer 10. The identity of a media streamer 10 to provide the video services is established at either the client control
system startup time or, optionally, at the application control program initiation time. Once the identity of media streamer 10 is
established, the procedure calls are directed to the correct media streamer 10 for servicing.

Except as indicated below, API functions are processed synchronously, i.e., once a function call is returned to the
caller, the function is completed and no additional processing at media streamer 10 is needed. By configuring the API
functions as synchronous operations, additional processing overheads for context switching, asynchronous signalling
and feedback are avoided. This performance is important in video server applications due to the stringent real-time
requirements.

The processing of API functions is performed in the order that requests are received. This ensures that user
operations are processed in the correct order. For example, a video must be connected (setup) before it can be played.
Another example is that switching the order of a "Play" request followed by a "Pause" request will have a completely
different result to the user.

A VS-PLAY function initiates the playing of the video and returns control to the caller immediately (without waiting
until the completion of the video play). The rationale for this architecture is that, since the time for playing a video is
typically long (minutes to hours) and unpredictable (there may be pause or stop commands), making the VS-PLAY
function asynchronous frees up the resources that would otherwise be allocated for an unpredictably long period of
time.

At completion of video play, media streamer 10 generates an asynchronous call to a system/port address specified
by the application control program to notify the application control program of the video completion event. The system/port
address is specified by the application control program when it calls the API VS-CONNECT function to connect the
video. It should be noted that the callback system/port address for VS-PLAY is specified at the individual video level.
That means the application control programs have the freedom of directing video completion messages to any control
point. For example, one application may desire the use of one central system/port to process the video completion
messages for many or all of the client control systems. In another application, several different system/port addresses
may be employed to process the video completion messages for one client control system.
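
The per-video completion callback can be sketched as follows. This is a minimal, in-process stand-in that assumes a Python callable in place of the system/port address and a thread in place of the playback machinery. The class and method names are hypothetical and do not represent the actual API signatures.

```python
import threading
from typing import Callable, Dict


class StreamerSketch:
    """Toy stand-in for the server side; not the actual API."""

    def __init__(self) -> None:
        self._callbacks: Dict[str, Callable[[str], None]] = {}
        self._workers: Dict[str, threading.Thread] = {}

    def vs_connect(self, video: str, on_complete: Callable[[str], None]) -> None:
        # The completion address is registered per video, at connect time.
        self._callbacks[video] = on_complete

    def vs_play(self, video: str) -> None:
        # Returns immediately; playback runs on its own thread.
        worker = threading.Thread(target=self._play, args=(video,))
        self._workers[video] = worker
        worker.start()

    def _play(self, video: str) -> None:
        # Real playback would stream segments here before notifying.
        self._callbacks[video](video)

    def wait(self, video: str) -> None:
        """Helper for this sketch: block until the playback thread has finished."""
        self._workers[video].join()
```

Because the callback is stored per video, two videos connected by the same application may report completion to different control points.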

With the API architecture, media streamer 10 is enabled to support multiple concurrent client control systems with 
heterogeneous hardware and software platforms, with efficient processing of both synchronous and asynchronous types 
of operations, while ensuring the correct sequencing of the operation requests. For example, the media streamer 10 
may use an IBM OS/2 operating system running on a PS/2 system, while a client control system may use an IBM AIX
operating system running on an RS/6000 system (IBM, OS/2, PS/2, AIX, and RS/6000 are all trademarks of the
International Business Machines Corporation).

D4. CLIENT/MEDIA STREAMER COMMUNICATIONS 

Communication between a client control system and the media streamer 10 is accomplished through, by example,
a known type of Remote Procedure Call (RPC) facility. Fig. 11 shows the RPC structure for the communications between
a client control system 11 and the media streamer 10. In calling media streamer functions, the client control system 11
functions as the RPC client and the media streamer 10 functions as the RPC server. This is indicated at (A) in Fig. 11.
However, for an asynchronous function, i.e., VS-PLAY, its completion causes media streamer 10 to generate a call to
the client control system 11. In this case, the client control system 11 functions as the RPC server, while media streamer
10 is the RPC client. This is indicated at (B) in Fig. 11.

D4.1. CLIENT CONTROL SYSTEM 11 

In the client control system 11, the user command line interface is comprised of three internal parallel processes
(threads). A first process parses a user command line input and performs the requested operation by invoking the API
functions, which result in RPC calls to the media streamer 10 ((A) in Fig. 11). This process also keeps track of the
status of videos being set up and played for various output ports. A second process periodically checks the elapsed
playing time of each video against its specified time limit. If a video has reached its time limit, the video is stopped
and disconnected and the next video in the wait queue (if any) for the same output port is started. A third process in the
client control system 11 functions as an RPC server to receive the VS-PLAY asynchronous termination notification from
the media streamer 10 ((B) in Fig. 11).
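
The periodic check performed by the second process may be sketched as below. The data layout and the names `PortState` and `check_port` are assumptions made for illustration; the stop, disconnect and start operations, which would be API calls in practice, are reduced to field updates.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Optional, Tuple


@dataclass
class PortState:
    """Per-port bookkeeping: the current video and the videos waiting behind it."""
    current: Optional[str] = None
    started_at: float = 0.0
    time_limit: float = 0.0
    wait_queue: Deque[Tuple[str, float]] = field(default_factory=deque)


def check_port(port: PortState, now: float) -> Optional[str]:
    """Stop the current video if it reached its limit and start the next queued one.

    Returns the name of the video that was started, if any.
    """
    if port.current is None or now - port.started_at < port.time_limit:
        return None
    port.current = None  # stop and disconnect the expired video
    if port.wait_queue:
        port.current, port.time_limit = port.wait_queue.popleft()
        port.started_at = now
        return port.current
    return None
```

Calling `check_port` for every output port at a fixed interval reproduces the behaviour described above, with the interval bounding how late a time limit can be enforced.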

D4.2. MEDIA STREAMER 10

During startup of media streamer 10, two parallel processes (threads) are invoked in order to support the RPCs
between the client control system(s) 11 and media streamer 10. A first process functions as an RPC server for the API
function calls coming from the client control system 11 ((A) in Fig. 11). The first process receives the RPC calls and
dispatches the appropriate procedures to perform the requested functions (such as VS-CONNECT, VS-PLAY, VS-DISCONNECT).
A second process functions as an RPC client for calling the appropriate client control system addresses
to notify the application control programs of asynchronous termination events. The process blocks itself waiting on
an internal pipe, which is written by other processes that handle the playing of videos. When one of the latter reaches the end
of a video or an abnormal termination condition, it writes a message to the pipe. The blocked process reads the message
and makes an RPC call ((B) in Fig. 11) to the appropriate client control system 11 port address so that the client control
system can update its status and take actions accordingly.
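
The blocking behaviour of the second process can be illustrated with a short sketch. It assumes a `queue.Queue` in place of the internal pipe and plain callables in place of the RPC call to the client port address. The `None` sentinel used to end the loop is an addition of this sketch.

```python
import queue
import threading
from typing import Callable, Dict, Optional, Tuple

Event = Tuple[str, str]  # (video name, reason for termination)


def notifier_loop(pipe: "queue.Queue[Optional[Event]]",
                  callbacks: Dict[str, Callable[[str, str], None]]) -> None:
    """Block on the pipe and forward each termination event to its callback."""
    while True:
        event = pipe.get()  # blocks until a playing process writes a message
        if event is None:   # sentinel used by this sketch to shut down
            return
        video, reason = event
        callbacks[video](video, reason)  # stands in for the RPC call at (B)
```

A playing process would call `pipe.put(("movie", "end-of-video"))` on reaching the end of a video, which wakes the blocked loop.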

E. MEDIA STREAMER MEMORY ORGANIZATION AND OPTIMIZATION FOR VIDEO DELIVERY 

An aspect of this invention provides integrated mechanisms for tailoring cache management and related I/O
operations to the video delivery environment. This aspect of the invention is now described in detail.

E1. PRIOR ART CACHE MANAGEMENT

Prior art mechanisms for cache management are built into cache controllers and the file subsystems of operating
systems. They are designed for general purpose use, and are not specialized to meet the needs of video delivery.

Fig. 12 illustrates one possible way in which a conventional cache management mechanism may be configured for
video delivery. This technique employs a video split between two disk files 160, 162 (because it is too large for one file),
and a processor containing a file system 164, a media server 168, and a video driver 170. Also illustrated are two video
adapter ports 172, 174 for two video streams. Also illustrated is the data flow to read a segment of disk file 160 into
main storage, and to subsequently write the data to a first video port 172, and also the data flow to read the same
segment and write it to a second video port 174. Fig. 12 is used to illustrate problems incurred by the prior art which are
addressed and overcome by the media streamer 10 of this invention.

Description of steps A1-A12 in Fig. 12.

A1. Media server 168 calls file system 166 to read segment Sk into a buffer in video driver 170.

A2. File system 166 reads a part of Sk into a cache buffer in file system 166.

A3. File system 166 copies the cache buffer into a buffer in video driver 170.

Steps A2 and A3 are repeated multiple times.

A4. File system 166 calls video driver 170 to write Sk to video port 1 (176).

A5. Video driver 170 copies part of Sk to a buffer in video driver 170.

A6. Video driver 170 writes the buffer to video port 1 (176).

Steps A5 and A6 are repeated multiple times.

Steps A7-A12 function in a similar manner, except that port 1 is changed to port 2. If a part of Sk is in the cache in
file system 166 when needed for port 2, then step A8 may be skipped.

As can be realized, video delivery involves massive amounts of data being transferred over multiple data streams.
The overall usage pattern fits neither of the two traditional patterns used to optimize caching: random and sequential.
If the random option is selected, most cache buffers will probably contain data from video segments which have been
recently read, but will have no video stream in line to read them before they have expired. If the sequential option is
chosen, the most recently used cache buffers are re-used first, so there is even less chance of finding the needed
segment part in the file system cache. As was described previously, an important element of video delivery is that the
data stream be delivered isochronously, that is, without breaks and interruptions that a viewer or user would find
objectionable. Prior art caching mechanisms, as just shown, cannot ensure the isochronous delivery of a video data stream
to a user.

Additional problems illustrated by Fig. 12 are: 

a. Disk and video port I/O is done in relatively small segments to satisfy general file system requirements. This
requires more processing time, disk seek overhead, and bus overhead than would be required by video segment
size segments.

b. The processing time to copy data between the file system cache buffers and media server buffers, and between
media server buffers and video driver buffers, is an undesirable overhead that it would be desirable to eliminate.

c. Using two video buffers (i.e. 172, 174) to contain copies of the same video segment at the same time is an
inefficient use of main memory. There is even more waste when the same data is stored in the file system cache
and also in the video driver buffers.

E2. VIDEO-OPTIMIZED CACHE MANAGEMENT 

There are three principal facets of the cache management operation in accordance with this aspect of the invention:
sharing segment size cache buffers across streams; predictive caching; and synchronizing to optimize caching.

E2.1. SHARING SEGMENT SIZE CACHE BUFFERS ACROSS STREAMS 

Videos are stored and managed in fixed size segments. The segments are sequentially numbered so that, for
example, segment 5 would store a portion of a video presentation that is nearer to the beginning of the presentation than
would a segment numbered 6. The segment size is chosen to optimize disk I/O, video I/O, bus usage and processor
usage. A segment of a video has a fixed content, which depends only on the video name and the segment number. All
I/O to disk and to the video output, and all caching operations, are done aligned on segment boundaries.

This aspect of the invention takes two forms, depending on whether the underlying hardware supports peer-to-peer
operations with data flow directly between disk and video output card in a communications node 14, without passing
through cache memory in the communications node. For peer-to-peer operations, caching is done at the disk storage
unit 16. For hardware which does not support peer-to-peer operations, data is read directly into page-aligned, contiguous
cache memory (in a communications node 14) in segment-sized blocks to minimize I/O operations and data movement.
(See F. Video Optimized Digital Memory Allocation, below).

The data remains in the same location and is written directly from this location until the video segment is no longer
needed. While the video segment is cached, all video streams needing to output the video segment access the same
cache buffer. Thus, a single copy of the video segment is used by many users, and the additional I/O, processor, and
buffer memory usage to read additional copies of the same video segment is avoided. For peer-to-peer operations, half
of the remaining I/O and almost all of the processor and main memory usage are avoided at the communication nodes 14.
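
The sharing of one buffer across streams can be sketched with a cache keyed by video name and segment number, which the text above identifies as the only things segment content depends on. The class name, the `read_segment` callable and the read counter are assumptions of this sketch, not part of the described system.

```python
from typing import Callable, Dict, Tuple

SegmentKey = Tuple[str, int]  # (video name, segment number)


class SegmentCache:
    """Holds at most one buffer per (video, segment), shared by every stream."""

    def __init__(self, read_segment: Callable[[str, int], bytes]) -> None:
        self._read_segment = read_segment
        self._buffers: Dict[SegmentKey, bytes] = {}
        self.disk_reads = 0

    def get(self, video: str, segment: int) -> bytes:
        key = (video, segment)
        if key not in self._buffers:
            # Only the first stream to need the segment pays for the disk read.
            self._buffers[key] = self._read_segment(video, segment)
            self.disk_reads += 1
        return self._buffers[key]

    def release(self, video: str, segment: int) -> None:
        # Called once no stream needs the segment any longer.
        self._buffers.pop((video, segment), None)
```

Two streams requesting the same segment receive the same buffer object, so only one disk read and one copy in memory are incurred.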

Fig. 13 illustrates an embodiment of the invention for the case of a system without peer-to-peer operations. The
video data is striped on the disk storage nodes 16 so that odd numbered segments are on first disk storage node 180
and even numbered segments are on second disk storage node 182 (see Section H below).
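
For the two-node arrangement just described, the mapping from segment number to storage node reduces to a parity test. The function name is hypothetical, and the general striping scheme of Section H is not reproduced here.

```python
def storage_node_for(segment_number: int) -> int:
    """Return the reference numeral of the node holding a segment (two-node striping)."""
    # Odd segments live on node 180, even segments on node 182.
    return 180 if segment_number % 2 == 1 else 182
```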
The data flow for this configuration is also illustrated in Fig. 13. As can be seen, segment Sk is to be read from disk
182 into a cache buffer 184 in communication node 186, and is then to be written to video output ports 1 and 2. The Sk
video data segment is read directly into cache buffer 184 with one I/O operation, and is then written to port 1. Next the
Sk video data segment is written from cache buffer 184 to port 2 with one I/O operation.

As can be realized, all of the problems described for the conventional approach of Fig. 12 are overcome by the
system illustrated in Fig. 13.

Fig. 14 illustrates the data flow for a configuration containing support for peer-to-peer operations between a disk
storage node and a video output card. A pair of disk drives 190, 192 contain a striped video presentation which is fed
directly to a pair of video ports 194, 196 without passing through the main memory of an intervening communication
node 14.

The data flow for this configuration is to read segment Sk from disk 192 directly to port 1 (with one I/O operation)
via disk cache buffer 198.

If a call follows to read segment Sk to port 2, segment Sk is read directly from disk cache buffer 198 into port 2
(with one I/O operation).

When the data read into the disk cache buffer 198 for port 1 is still resident for the write to port 2, the best possible
use of memory, bus, and processor resources results in the transfer of the video segment to ports 1 and 2.

It is possible to combine the peer-to-peer and main memory caching mechanisms, e.g., using peer-to-peer operations
for video presentations which are playing to only one port of a communication node 14, and caching in the
communication node 14 for video presentations which are playing to multiple ports of the communication node 14.

A policy for dividing the caching responsibility between disk storage nodes and the communication node is chosen
to maximize the number of video streams which can be supported with a given hardware configuration. If the number
of streams to be supported is known, then the amount and placement of caching storage can be determined.

E2.2. PREDICTIVE CACHING

Video presentations are in general very predictable. Typically, they start playing at the beginning, play at a fixed rate for a fairly lengthy
predetermined period, and stop only when the end is reached. The caching approach of the media streamer 10 takes
advantage of this predictability to optimize the set of video segments which are cached at any one time.

The predictability is used both to schedule a read operation to fill a cache buffer, and to drive the algorithm for
reclaiming of cache buffers. Buffers whose contents are not predicted to be used before they would expire are reclaimed
immediately, freeing the space for higher priority use. Buffers whose contents are in line for use within a reasonable
time are not reclaimed, even if their last use was long ago.

More particularly, given videos v1, v2, ..., and streams s1, s2, ... playing these videos, each stream sj plays one video,
v(sj), and the time predicted for writing the k-th segment of v(sj) is a linear function:

t(sj,k) = a(sj) + r(sj)k,

where a(sj) depends on the start time and starting segment number, r(sj) is the constant time it takes to play a
segment, and t(sj,k) is the scheduled time to play the k-th segment of stream sj.
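
The linear prediction can be written out as a short sketch. The text states only that a(sj) depends on the start time and starting segment number. The specific form a = start_time - r * start_segment used below is an assumption, chosen so that the starting segment is predicted at the start time. The class and field names are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stream:
    start_time: float        # time at which the stream begins playing
    start_segment: int       # first segment played
    segment_duration: float  # r(sj): constant time to play one segment

    @property
    def offset(self) -> float:
        # a(sj), assuming a = start_time - r * start_segment.
        return self.start_time - self.segment_duration * self.start_segment

    def predicted_time(self, k: int) -> float:
        # t(sj, k) = a(sj) + r(sj) * k
        return self.offset + self.segment_duration * k
```

For a stream that starts at time 100 on segment 10 with 2 seconds per segment, the sketch gives a = 80 and predicts segment 12 at time 104.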

This information is used both to schedule a read operation to fill a cache buffer, and to drive the algorithm for re-using 
cache buffers. Some examples of the operation of the cache management algorithm follow: 

EXAMPLE A 



A cache buffer containing a video segment which is not predicted to be played by any of the currently playing video
streams is re-used before re-using any buffers which are predicted to be played. After satisfying this constraint, the
frequency of playing the video and the segment number are used as weights to determine a priority for keeping the
video segment cached. The highest retention priority within this group is assigned to video segments that occur early
in a frequently played video.

EXAMPLE B



For a cache buffer containing a video segment which is predicted to be played, the next predicted play time and the
number of streams left to play the video segment are used as weights to determine the priority for keeping the video
segment cached. The weights essentially allow the retention priority of a cache buffer to be set to the difference between
the predicted number of I/Os (for any video segment) with the cache buffer reclaimed, and the predicted number with it
retained. For example, if

v5 is playing on s7,

v8 is playing on s2 and s3, with s2 running 5 seconds behind s3, and

v4 is playing on streams s12 to s20 with each stream 30 seconds behind the next,

then:

buffers containing v5 data already used by s7 are reclaimed first, followed by buffers containing v8 data already
used by s2, followed by buffers containing v4 data already used by s12, followed by remaining buffers with the lowest
retention priority.

The cache management algorithm provides variations for special cases such as connection operations (where it is 
possible to predict that a video segment will be played in the near future, but not exactly when) and stop operations 
(when previous predictions must be revised). 
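
One possible ordering consistent with Examples A and B is sketched below. It is an illustration under simplifying assumptions, not the algorithm itself. Buffers that no stream is predicted to play come first. The rest are ordered by how far away their next predicted use is, with the number of streams still to play them as a tie-breaker. The frequency and segment-number weights of Example A are omitted, and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

Buffer = Tuple[str, int]  # (video name, segment number)


@dataclass(frozen=True)
class StreamState:
    video: str
    next_segment: int        # next segment this stream will output
    segment_duration: float  # seconds per segment


def reclaim_order(buffers: List[Buffer], streams: List[StreamState]) -> List[Buffer]:
    """Order buffers from first-to-reclaim to last-to-reclaim."""

    def key(buf: Buffer):
        video, segment = buf
        waits = [
            (segment - s.next_segment) * s.segment_duration
            for s in streams
            if s.video == video and s.next_segment <= segment
        ]
        if not waits:
            # No stream is predicted to play this buffer: reclaim it first.
            return (0, 0.0, 0)
        # Otherwise reclaim buffers needed latest, and by the fewest streams, first.
        return (1, -min(waits), len(waits))

    return sorted(buffers, key=key)
```

With two streams of the same video at segments 8 and 10, a buffer holding segment 7 has been passed by both and is reclaimed before segment 12, which in turn is reclaimed before segment 9, needed by the trailing stream almost immediately.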



E2.3. SYNCHRONIZING STREAMS TO OPTIMIZE CACHING 

It is desirable to cluster all streams that require a given video segment, to minimize the time that the cache buffer 
containing that segment must remain in storage and thus leave more of the system capacity available for other video 
streams. For video playing, there is usually little flexibility in the rate at which segments are played. However, in some
applications of video delivery the rate of playing is flexible (that is, video and audio may be accelerated or decelerated
slightly without evoking adverse human reactions). Moreover, videos may be delivered for purposes other than immediate
human viewing. When a variation in rate is allowed, the streams out in front (timewise) are played at the minimum
allowable rate and those in back (timewise) at the maximum allowable rate in order to close the gap between the streams
and reduce the time that segments must remain buffered.
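
The effect of rate variation on the gap between two streams can be estimated with simple arithmetic. The sketch below assumes rates expressed as multiples of the nominal playing rate and a gap measured in seconds of video. The function name and the example figures are illustrative assumptions.

```python
def seconds_to_merge(gap_seconds: float, min_rate: float, max_rate: float) -> float:
    """Wall-clock time for a trailing stream to catch a leading one.

    The leader plays at min_rate and the trailer at max_rate, so the gap
    shrinks by (max_rate - min_rate) seconds of video per second.
    """
    if max_rate <= min_rate:
        raise ValueError("max_rate must exceed min_rate for the gap to close")
    return gap_seconds / (max_rate - min_rate)
```

For instance, a 30 second gap with rates of 0.95 and 1.05 closes in roughly 300 seconds, after which both streams can share each cached segment at the same moment.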

The clustering of streams using a same video presentation is also taken into account during connection and play
operations. For example, VS-PLAY-AT-SIGNAL can be used to start playing a video on multiple streams at the same
time. This improves clustering, leaving more system resources for other video streams, enhancing the effective capacity
of the system. More specifically, clustering, by delaying one stream for a short period so that it coincides in time with a
second stream, enables one copy of segments in cache to be used for both streams and thus conserves processing
assets.

F. VIDEO OPTIMIZED DIGITAL MEMORY ALLOCATION 



Digital video data has attributes unlike those of normal data processing data in that it is non-random (that is,
sequential), large, and time critical rather than content critical. Multiple streams of data must be delivered at high bit rates,
requiring all nonessential overhead to be minimized in the data path. Careful buffer management is required to maximize
the efficiency and capacity of the media streamer 10. Memory allocation, deallocation, and access are key elements in
this process, and improper usage can result in memory fragmentation, decreased efficiency, and delayed or corrupted
video data.

The media streamer 10 of this invention employs a memory allocation procedure which allows high level applications
to allocate and deallocate non-swappable, page aligned, contiguous memory segments (blocks) for digital video data.
The procedure provides a simple, high level interface to video transmission applications and utilizes low level operating
system modules and code segments to allocate memory blocks in the requested size. The memory blocks are contiguous
and fixed in physical memory, eliminating the delays or corruption possible from virtual memory swapping or paging,
and the complexity of having to implement gather/scatter routines in the data transmission software.

The high level interface also returns a variety of addressing mode values for the requested memory block, eliminating
the need to do costly dynamic address conversion to fit the various memory models that can be operating concurrently
in a media streamer environment. The physical address is available for direct access by other device drivers, such as
a fixed disk device, as well as the process linear and process segmented addresses that are used by various applications.
A deallocation routine is also provided that returns a memory block to the system, eliminating fragmentation problems
since the memory is all returned as a single block.



F1. COMMANDS EMPLOYED FOR MEMORY ALLOCATION

1. Allocate Physical Memory:

Allocate the requested size memory block. A control block is returned with the various memory model addresses of
the memory area, along with the length of the block.

2. Deallocate Physical Memory:

Return the memory block to the operating system and free the associated memory pointers.

F2. APPLICATION PROGRAM INTERFACE 

A device driver is defined in the system configuration files and is automatically initialized as the system starts. An
application then opens the device driver as a pseudo device to obtain its label, then uses the interface to pass the
commands and parameters. The supported commands are Allocate Memory and Deallocate Memory; the parameters
are memory size and pointers to the logical memory addresses. These addresses are set by the device driver once the
physical block of memory has been allocated and the physical address is converted to logical addresses. A null is
returned if the allocation fails.
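
The shape of the allocate and deallocate commands can be modelled with a toy allocator. The sketch only simulates address arithmetic over a fixed region and does not obtain non-swappable physical memory. The 4096-byte page size, the fixed linear offset, the omission of the segmented address and all names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional

PAGE_SIZE = 4096  # assumed page size


@dataclass(frozen=True)
class ControlBlock:
    physical: int  # physical address, for use by device drivers
    linear: int    # process linear address
    length: int    # block length in bytes


class BlockAllocator:
    """Toy model: hands out page-aligned, contiguous ranges from a fixed region."""

    def __init__(self, base: int, size: int, linear_offset: int) -> None:
        # base is assumed to be page aligned.
        self._next = base
        self._end = base + size
        self._linear_offset = linear_offset
        self._live: Dict[int, ControlBlock] = {}

    def allocate(self, size: int) -> Optional[ControlBlock]:
        length = -(-size // PAGE_SIZE) * PAGE_SIZE  # round up to whole pages
        if size <= 0 or self._next + length > self._end:
            return None  # a null result signals that the allocation failed
        block = ControlBlock(self._next, self._next + self._linear_offset, length)
        self._live[block.physical] = block
        self._next += length
        return block

    def deallocate(self, block: ControlBlock) -> bool:
        # The whole block is returned at once; this toy model does not reuse it.
        return self._live.pop(block.physical, None) is not None
```

Returning every address form in one control block is what lets each caller use its native addressing without converting addresses at run time.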

Fig. 15 shows a typical set of applications that would use this procedure. Buffer 1 is requested by a 32-bit application
for data that is modified and then placed into buffer 2. This buffer can then be directly manipulated by a 16-bit application
using a segmented address, or by a physical device such as a fixed disk drive. By using this allocation scheme to
preallocate the fixed, physical, and contiguous buffers, each application is enabled to use its native direct addressing
to access the data, eliminating the address translation and dynamic memory allocation delays. A video application may
use this approach to minimize data movement by placing the digital video data in the buffer directly from the physical
disk, then transferring it directly to the output device without moving it several times in the process.

G. DISK DRIVE OPTIMIZED FOR VIDEO APPLICATIONS 

It is important that video streams be delivered to their destination isochronously, that is, without delays that can be
perceived by the human eye as discontinuities in movement or by the ear as interruptions in sound. Current disk
technology may involve periodic action, such as performing predictive failure analysis, that may cause significant delays in
data access. While most I/O operations complete within 100 ms, periodic delays of 100 ms are common and delays of
three full seconds can occur.

The media streamer 10 must also be capable of efficiently sustaining high data transfer rates. A disk drive configured
for general purpose data storage and retrieval will suffer inefficiencies in the use of memory, disk buffers, SCSI bus and
disk capacity if not optimized for the video server application.

In accordance with an aspect of the invention, disk drives employed herewith are tailored for the role of smooth and
timely delivery of large amounts of data by optimizing disk parameters. The parameters may be incorporated into the
manufacture of disk drives specialized for video servers, or they may be variables that can be set through a command
mechanism.

Parameters controlling periodic actions are set to minimize or eliminate delays. Parameters affecting buffer usage 
are set to allow for transfer of very large amounts of data in a single read or write operation. Parameters affecting speed
matching between a SCSI bus and a processor bus are tuned so that data transfer starts neither too soon nor too late.
The disk media itself is formatted with a sector size that maximizes effective capacity and bandwidth.

To accomplish optimization: 

The physical disk media is formatted with a maximum allowable physical sector size. This formatting option minimizes
the amount of space wasted in gaps between sectors, maximizes device capacity, and maximizes the burst data
rate. A preferred implementation is 744 byte sectors.

Disks may have an associated buffer. This buffer is used for reading data from the disk media asynchronously from
availability of the bus for the transfer of the data. Likewise, the buffer is used to hold data arriving from the bus
asynchronously from the transfer of that data to the disk media. The buffer may be divided into a number of segments and
the number is controlled by a parameter. If there are too many segments, each may be too small to hold the amount of
data requested in a single transfer. When the buffer is full, the device must initiate reconnection and begin transfer; if
the bus/device is not available at this time, a rotational delay will ensue. In the preferred implementation, this value is
set so that any buffer segment is at least as large as the data transfer size, e.g., set to one.

As a buffer segment begins to fill on a read, the disk attempts to reconnect to the bus to effect a data transfer to the
host. The point in time that the disk attempts this reconnection affects the efficiency of bus utilization. The relative speeds
of the bus and the disk determine the best point in time during the fill operation to begin data transfer to the host. Likewise,
during write operations, the buffer will fill as data arrives from the host and, at a certain point in the fill process, the disk
should attempt a reconnection to the bus. Accurate speed matching results in fewer disconnect/reselect cycles on the
SCSI bus, with resulting higher maximum throughput.

The parameters that control when the reconnection is attempted are called "read buffer full ratio" and "write buffer
empty ratio". For video data, the preferred algorithm for calculating these ratios is 256 x (Instantaneous SCSI Data
Transfer Rate - Sustainable Disk Data Transfer Rate) / Instantaneous SCSI Data Transfer Rate. Presently preferred
values for buffer-full and buffer-empty ratios are approximately 204.
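The ratio formula above can be checked with a short calculation. The following is a minimal sketch, not the drive's actual firmware logic: the function name is hypothetical, the 20 MB/sec. bus rate is the SCSI fast/wide figure quoted elsewhere in this description, the 4 MB/sec. sustained disk rate is an assumed value chosen for illustration, and truncating the result to an integer is also an assumption.

```python
def buffer_ratio(scsi_rate, disk_rate):
    """Return 256 x (SCSI rate - disk rate) / SCSI rate, truncated to an integer.

    scsi_rate: instantaneous SCSI data transfer rate
    disk_rate: sustainable disk data transfer rate (same units as scsi_rate)
    """
    if scsi_rate <= 0:
        raise ValueError("SCSI rate must be positive")
    if not 0 <= disk_rate <= scsi_rate:
        raise ValueError("disk rate must lie between 0 and the SCSI rate")
    # Truncation to an integer register value is an assumption of this sketch.
    return int(256 * (scsi_rate - disk_rate) / scsi_rate)


# Assumed example rates: 20 MB/sec. bus, 4 MB/sec. sustained from the media.
ratio = buffer_ratio(20.0, 4.0)
print(ratio)
```

With these assumed rates the computed value falls at the "approximately 204" level stated in the text; a faster disk relative to the bus yields a smaller ratio.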

Some disk drive designs require periodic recalibration of head position with changes in temperature. Some of these
disk drive types further allow control over whether thermal compensation is done for all heads in an assembly at the
same time, or whether thermal compensation is done one head at a time. If all heads are done at once, delays of
hundreds of milliseconds during a read operation for video data may ensue. Longer delays in read times result in the
need for larger main memory buffers to smooth data flow and prevent artifacts in the multimedia presentation. The
preferred approach is to program the Thermal Compensation Head Control function to allow compensation of one head
at a time.

The saving of error logs and the performance of predictive failure analysis can take several seconds to complete.
These delays cannot be tolerated by video server applications without very large main memory buffers to smooth over
the delays and prevent artifacts in the multimedia presentation. Limit Idle Time Function parameters can be used to
inhibit the saving of error logs and performing idle time functions. The preferred implementation sets a parameter to limit
these functions.



H. DATA STRIPING FOR VIDEO DATA 



In video applications, there is a need to deliver multiple streams from the same data (e.g., a movie). This requirement
translates to a need to read data at a high data rate; that is, a data rate needed for delivering one stream multiplied by
the number of streams simultaneously accessing the same data. Conventionally, this problem was generally solved by
having multiple copies of the data, which resulted in additional expense. The media streamer 10 of this invention uses
a technique for serving many simultaneous streams from a single copy of the data. The technique takes into account
the data rate for an individual stream and the number of streams that may be simultaneously accessing the data.

The above-mentioned data striping involves the concept of a logical file whose data is partitioned to reside in multiple
file components, called stripes. Each stripe is allowed to exist on a different disk volume, thereby allowing the logical
file to span multiple physical disks. The disks may be either local or remote.

When the data is written to the logical file, it is separated into logical lengths (i.e., segments) that are placed sequentially
into the stripes. As depicted in Fig. 16, a logical file for a video, video 1, is segmented into M segments or blocks
each of a specific size, e.g., 256KB. The last segment may only be partially filled with data. A segment of data is placed
in the first stripe, followed by a next segment that is placed in the second stripe, etc. When a segment has been written
to each of the stripes, the next segment is written to the first stripe. Thus, if a file is being striped into N stripes, then
stripe 1 will contain the segments 1, N+1, 2*N+1, etc., and stripe 2 will contain the segments 2, N+2, 2*N+2, etc., and
so on.
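The round-robin placement just described can be sketched in a few lines. This is an illustrative sketch only; the function names are hypothetical, and segments and stripes are numbered from 1 to match the text.

```python
def stripe_for_segment(segment, num_stripes):
    """Return the 1-based stripe that holds the given 1-based segment."""
    if segment < 1 or num_stripes < 1:
        raise ValueError("segment and num_stripes must be at least 1")
    return (segment - 1) % num_stripes + 1


def stripe_layout(total_segments, num_stripes):
    """Map each stripe number to the ordered list of segments it contains."""
    layout = {stripe: [] for stripe in range(1, num_stripes + 1)}
    for segment in range(1, total_segments + 1):
        layout[stripe_for_segment(segment, num_stripes)].append(segment)
    return layout


# Twelve segments over four stripes: stripe 1 gets 1, 5, 9; stripe 2 gets 2, 6, 10; etc.
for stripe, segments in stripe_layout(12, 4).items():
    print(stripe, segments)
```

Because consecutive segments land on different stripes, a stream that plays segments in order visits each disk in turn rather than loading a single disk continuously.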

A similar striping of data is known to be used in data processing RAID arrangements, where the purpose of striping
is to assure data integrity in case a disk is lost. Such a RAID storage system dedicates one of N disks to the storage of
parity data that is used when data recovery is required. The disk storage nodes 16 of the media streamer 10 are organized
as a RAID-like structure, but parity data is not required (as a copy of the video data is available from a tape store).

Fig. 17 illustrates a first important aspect of this data arrangement, i.e., the separation of each video presentation
into data blocks or segments that are spread across the available disk drives to enable each video presentation to be
accessed simultaneously from multiple drives without requiring multiple copies. Thus, the concept is one of striping, not
for data integrity reasons or performance reasons, per se, but for concurrency or bandwidth reasons. Thus, the media
streamer 10 stripes video presentations by play segments, rather than by byte block, etc.

As is shown in Fig. 17, where a video data file 1 is segmented into M segments and split into four stripes, stripe 1
is a file containing segments 1, 5, 9, etc. of video file 1; stripe 2 is a file containing segments 2, 6, 10, etc. of video file
1; stripe 3 is a file containing segments 3, 7, 11, etc. of the video file; and stripe 4 is a file containing the segments 4, 8,
12, etc. of video file 1, until all M segments of video file 1 are contained in one of the four stripe files.

Given the described striping strategy, parameters are computed as follows to customize the striping of each indi- 
vidual video. 

First, the segment size is selected so as to obtain a reasonably effective data rate from the disk. However, it cannot
be so large as to adversely affect the latency. Further, it should be small enough to buffer/cache in memory. A preferred
segment size is 256KB, and is constant for video presentations of data rates in ranges from 128KB/sec. to 512KB/sec.
If the video data rate is higher, then it may be preferable to use a larger segment size. The segment size depends on
the basic unit of I/O operation for the range of video presentations stored on the same media. The principle employed
is to use a segment size that contains approximately 0.5 to 2 seconds of video data.

Next, the number of stripes, i.e., the number of disks over which video data is distributed, is determined. This number
must be large enough to sustain the total data rate required and is computed individually for each video presentation
based on an anticipated usage rate. More specifically, each disk has a logical volume associated with it. Each video
presentation is divided into component files, as many components as the number of stripes needed. Each component
file is stored on a different logical volume. For example, if video data has to be delivered at 250 KB/sec. per stream and
30 simultaneous streams are supported from the same video, started at, say, 15 second intervals, a total data rate of at
least 7.5 MB/sec. is obtained. If a disk drive can support on average 3 MB/sec., at least 3 stripes are required for the
video presentation.

The effective rate at which data can be read from a disk is influenced by the size of the read operation. For example,
if data is read from the disk in 4KB blocks (from random positions on the disk), the effective data rate may be 1 MB/sec.,
whereas if the data is read in 256KB blocks the rate may be 3 MB/sec. However, if data is read in very large blocks, the
memory required for buffers also increases and the latency, the delay in using the data read, also increases because
the operation has to complete before the data can be accessed. Hence there is a trade-off in selecting a size for data
transfer. A size is selected based on the characteristics of the devices and the memory configuration. Preferably, the
size of the data transfer is the selected segment size. For a given segment size the effective data rate from a device is
determined. For example, for some disk drives, a 256KB segment size provides a good balance for the effective use of
the disk drives (effective data rate of 3 MB/sec.) and buffer size (256 KB).

If striping is not used, the maximum number of streams that can be supported is limited by the effective data rate
of the disk, e.g., if the effective data rate is 3 MB/sec. and a stream data rate is 200 KB/sec., then no more than 15 streams can
be supplied from the disk. If, for instance, 60 streams of the same video are needed, then the data has to be duplicated
on 4 disks. However, if striping is used in accordance with this invention, 4 disks of 1/4 the capacity can be used. Fifteen
streams can be simultaneously played from each of the 4 stripes for a total of 60 simultaneous streams from a single
copy of the video data. The start times of the streams are skewed to ensure that the requests for the 60 streams are
evenly spaced among the stripes. Note also that if the streams are started close to each other, the need for I/O can be
reduced by using video data that is cached.

The number of stripes for a given video is influenced by two factors: the first is the maximum number of streams
that are to be supplied at any time from the video, and the other is the total number of streams that need to be supplied
at any time from all the videos stored on the same disks as the video.

The number of stripes (s) for a video is determined as follows:

s = maximum (r*n/d, r*m/d), 

where:

r = nominal data rate at which the stream is to be played;
n = maximum number of simultaneous streams from this video presentation at the nominal rate;
d = effective data rate from a disk (note that the effective data rate from a disk is influenced by the segment size);
m = maximum number of simultaneous streams at the nominal rate from all disks that contain any part of this video
presentation; and
s = number of stripes for a video presentation.

The disks over which data for a video presentation is striped are managed as a set, and can be thought
of as a very large physical disk. Striping allows a video file to exceed the size limit of the largest file that a system's
physical file system will allow. The video data, in general, will not always require the same amount of storage on all the
disks in the set. To balance the usage of the disks, when a video is striped, the striping is begun from the disk that has
the most free space.

As an example, consider the case of a video presentation that needs to be played at 2 Mbits/sec. (250,000
bytes/sec.), i.e., r is equal to 250,000 bytes/sec., and assume that it is necessary to deliver up to 30 simultaneous
streams from this video, i.e., n is 30. Assume in this example that m is also 30, i.e., the total number of streams to be
delivered from all disks is also 30. Further, assume that the data is striped in segments of 250,000 bytes and that the
effective data rate from a disk for the given segment size (250,000 bytes) is 3,000,000 bytes/sec. Then s, the number
of stripes needed, is (250,000 * 30 / 3,000,000) = 2.5, which is rounded up to 3 (s = ceiling(r*n/d)).

If the maximum number of streams from all disks that contain this data is, for instance, 45, then 250,000 * 45 /
3,000,000 or 3.75 stripes are needed, which is rounded up to 4 stripes.

Even though striping the video into 3 stripes is sufficient to meet the requirement for delivering the 30 streams from
the single copy of the video, if disks containing the video also contain other content, and the total number of streams
from that video to be supported is 45, then four disk drives are needed (striping level of 4).
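The stripe-count formula and the two worked cases can be reproduced as follows. This is a minimal sketch under the definitions of r, n, m and d given in the text; the function name is hypothetical.

```python
import math


def stripes_needed(r, n, m, d):
    """Return s = ceiling(maximum(r*n/d, r*m/d)).

    r: nominal stream data rate (bytes/sec.)
    n: maximum simultaneous streams from this video
    m: maximum simultaneous streams from all disks holding any part of this video
    d: effective data rate of one disk (bytes/sec.)
    """
    if d <= 0:
        raise ValueError("disk data rate must be positive")
    return math.ceil(max(r * n / d, r * m / d))


# First case in the text: n = m = 30 gives 2.5, rounded up to 3 stripes.
print(stripes_needed(250_000, 30, 30, 3_000_000))
# Second case: 45 streams across the shared disks gives 3.75, rounded up to 4.
print(stripes_needed(250_000, 30, 45, 3_000_000))
```

The larger of the two terms governs, so a video that shares its disks with other popular content may need more stripes than its own audience alone would require.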

The manner in which the algorithm is used in the media streamer 10 is as follows. The storage (number of disk
drives) is divided into groups of disks. Each group has a certain capacity and capability to deliver a given number of
simultaneous streams (at an effective data rate per disk based on a predetermined segment size). The segment size
for each group is constant. Different groups may choose different segment sizes (and hence have different effective
data rates). When a video presentation is to be striped, a group is first chosen by the following criteria.

The segment size is consistent with the data rate of the video, i.e., if the stream data rate is 250,000 bytes/sec., the
segment size is in the range of 125 KB to 500 KB. The next criterion is to ensure that the number of disks in the group is
sufficient to support the maximum number of simultaneous streams, i.e., the number of disks is at least r*n/d, where 'r' is
the stream data rate, 'n' the maximum number of simultaneous streams, and 'd' the effective data rate of a disk in the group.
Finally, it should be ensured that the sum total of simultaneous streams that need to be supported from all of the videos
in the disk group does not exceed its capacity. That is, if 'm' is the capacity of the group, then 'm - n' should be greater
than or equal to the sum of all the streams that can be played simultaneously from the videos already stored in the group.
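The three group-selection criteria can be expressed as a single predicate. The sketch below is illustrative only: the DiskGroup fields and the function name are hypothetical, the segment-size window is taken as 0.5 to 2 seconds of video at the stream rate, and the disk-count bound is assumed to be ceiling(r*n/d), following the stripe formula given earlier.

```python
import math
from dataclasses import dataclass


@dataclass
class DiskGroup:
    num_disks: int          # disks in the group
    segment_size: int       # bytes, constant within the group
    disk_rate: float        # d: effective bytes/sec. per disk at this segment size
    capacity_streams: int   # m: simultaneous streams the group can deliver
    committed_streams: int  # streams already promised to videos stored in the group


def group_fits(group, r, n):
    """Return True if a video with stream rate r and n simultaneous streams fits."""
    # Criterion 1: the segment holds roughly 0.5 to 2 seconds of video.
    if not 0.5 * r <= group.segment_size <= 2 * r:
        return False
    # Criterion 2: enough disks to sustain n streams (assumed bound r*n/d, rounded up).
    if group.num_disks < math.ceil(r * n / group.disk_rate):
        return False
    # Criterion 3: m - n must cover the streams already committed in the group.
    return group.capacity_streams - n >= group.committed_streams


group = DiskGroup(num_disks=4, segment_size=256 * 1024, disk_rate=3_000_000,
                  capacity_streams=60, committed_streams=20)
print(group_fits(group, 250_000, 30))
print(group_fits(group, 250_000, 45))
```

In this example the first request fits, while the second fails the capacity check because 60 - 45 leaves fewer streams than the 20 already committed.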

The calculation is done in control node 18 at the time the video data is loaded into the media streamer 10. In the
simplest case all disks will be in a single pool which defines the total capacity of the media streamer 10, both for storage
and the number of supportable streams. In this case the number of disks (or stripes) necessary to support a given
number of simultaneous streams is calculated from the formula m*r/d, where m is the number of streams, r is the data
rate for a stream, and d is the effective data rate for a disk. Note that if the streams can be of different rates, then m*r,
in the above formula, should be replaced by: Max (sum of the data rates of all simultaneous streams).

The result of using this technique for writing the data is that the data can be read for delivering many streams at a
specified rate without the need for multiple copies of the digital representation of the video presentation. By striping the
data across multiple disk volumes, the reading of one part of the file for delivering one stream does not interfere with the
reading of another part of the file for delivering another stream.

I. MEDIA STREAMER DATA TRANSFERS AND CONVERSION PROCEDURES 


I1. DYNAMIC BANDWIDTH ALLOCATION FOR VIDEO DELIVERY TO THE SWITCH 18



Conventionally, video servers generally fit one of two profiles. Either they use PC technology to build a low cost (but
also low bandwidth) video server or they use super-computing technology to build a high bandwidth (also expensive)
video server. An object of this invention, then, is to deliver high bandwidth video, but without the high cost of super-computing
technology.

A preferred approach to achieving high bandwidth at low cost is to use the low latency switch (crossbar circuit switch
matrix) 18 to interconnect low cost PC based 'nodes' into a video server (as shown in Fig. 1). An important aspect of
the media streamer architecture is efficient use of the video stream bandwidth that is available in each of the storage
nodes 16 and communication nodes 14. The bandwidth is maximized by combining the special time bandwidth allocation
capability of a low-cost switch technology.

Fig. 18 shows a conventional logical connection between a switch interface and a storage node. The switch interface
must be full duplex (i.e., information can be sent in either direction simultaneously) to allow the transfer of video (and
control information) both into and out of the storage node. Because video content is written to the storage node once
and then read many times, most of the bandwidth requirements for the storage node are in the direction towards the
switch. In the case of a typical switch interface, the bandwidth of the storage node is under-utilized because that half of
the bandwidth devoted to write capability is so infrequently used.

Fig. 19 shows a switch interface in accordance with this invention. This interface dynamically allocates its total
bandwidth in real time either into or out of the switch 18 to meet the current demands of the node. (The storage node
16 is used as an example.) The communication nodes 14 have similar requirements, but most of their bandwidth is in
the direction from the switch 18.

The dynamic allocation is achieved by grouping two or more of the physical switch interfaces, using appropriate
routing headers for the switch 12, into one logical switch interface 18a. The video data (on a read, for example) is then
split between the two physical interfaces. This is facilitated by striping the data across multiple storage units as described
previously. The receiving node combines the video data back into a single logical stream.

As an example, in Fig. 18 the switch interface is rated at 2X MB/sec. full duplex, i.e., X MB/sec. in each direction.
But video data is usually sent only in one direction (from the storage node into the switch). Therefore only X MB/sec. of
video bandwidth is delivered from the storage node, even though the node has twice that capability (2X). The storage
node is under-utilized. The switch interface of Fig. 19 dynamically allocates the entire 2X MB/sec. bandwidth to transmitting
video from the storage node into the switch. The result is increased bandwidth from the node, higher bandwidth
from the video server, and a lower cost per video stream.

J. ISOCHRONOUS VIDEO DATA DELIVERY USING COMMUNICATIONS ADAPTERS



Digital video data is sequential, continuous, large, and time critical, rather than content critical. Streams of video
data must be delivered isochronously at high bit rates, requiring all nonessential overhead to be minimized in the data
path. Typically, the receiving hardware is a video set top box or some other suitable video data receiver. Standard serial
communication protocols insert additional bits and bytes of data into the stream for synchronization and data verification,
often at the hardware level. This corrupts the video data stream if the receiver is not able to transparently remove the
additional data. The additional overhead introduced by these bits and bytes also decreases the effective data rate, which
creates video decompression and conversion errors.

It has been determined that the transmission of video data over standard communications adapters, to ensure
isochronous delivery to a user, requires disabling most of the standard serial communications protocol attributes. The
methods for achieving this vary depending on the communications adapters used, but the following describes the underlying
concepts. In Fig. 20, a serial communications chip 200 in a communications node 14 disables data formatting
and integrity information, such as the parity, start and stop bits, cyclic redundancy check codes and sync bytes, and
prevents idle characters from being generated. Input FIFO buffers 202, 204, 206, etc. are employed to ensure a constant
(isochronous) output video data stream while allowing bus cycles for loading of the data blocks. A 1000 byte FIFO buffer
208 simplifies the CPU and bus loading logic.

If communications output chip 200 does not allow the disabling of an initial synchronization (sync) byte generation,
then the value of the sync byte is programmed to the value of the first byte of each data block (and the data block pointer
is incremented to the second byte). Byte alignment must also be managed with real data, since any padding bytes will
corrupt the data stream if they are not part of the actual compressed video data.

To achieve the constant, high speed serial data outputs required for the high quality levels of compressed video
data, either a circular buffer or a plurality of large buffers (e.g., 202, 204, 206) must be used. This is necessary to allow
sufficient time to fill an input buffer while outputting data from a previously filled buffer. Unless buffer packing is done
earlier in the video data stream path, the end of video condition can result in a very small buffer that will be output before
the next buffer transfer can complete, resulting in a data underrun. This necessitates a minimum of three large, independent
buffers. A circular buffer in dual mode memory (writable while reading) is also a suitable embodiment.



J1. CONVERSION OF VIDEO IMAGES AND MOVIES FROM COMPRESSED MPEG-1, 1+, OR MPEG-2 DIGITAL
DATA FORMAT INTO INDUSTRY STANDARD TELEVISION FORMATS (NTSC OR PAL)

As described above, digital video data is moved from disk to buffer memory. Once enough data is in buffer memory,
it is moved from memory to an interface adapter in a communications node 14. The interfaces used are the SCSI 20
MB/sec. fast/wide interface or the SSA serial SCSI interface. The SCSI interface is expanded to handle 15 addresses
and the SSA architecture supports up to 256. Other suitable interfaces include, but are not limited to, RS422, V.35, V.36,
etc.

As shown in Fig. 21, video data from the interface is passed from a communication node 14 across a communications
bus 210 to NTSC adapter 212 (see also Fig. 20) where the data is buffered. Adapter 212 pulls the data from a local
buffer 214, where multiple blocks of data are stored to maximize the performance of the bus. The key goal of adapter
212 is to maintain an isochronous flow of data from the memory 214 to MPEG chips 216, 218 and thus to NTSC chip
220 and D/A 222, to ensure that there are no interruptions in the delivery of video and/or audio.

MPEG logic modules 216, 218 convert the digital (compressed) video data into component level video and audio.
An NTSC encoder 220 converts the signal into NTSC baseband analog signals. MPEG audio decoder 216 converts the
digital audio into parallel digital data which is then passed through a Digital to Analog converter 222 and filtered to
generate audio Left and Right outputs.

The goal in creating a solution to the speed matching and isochronous delivery problem is an approach that not
only maximizes the bandwidth delivery of the system but also imposes the fewest performance constraints.

Typically, application developers have used a bus structure, such as SSA and SCSI, for control and delivery of data
between processors and mechanical storage devices such as disk files, tape files, optical storage units, etc. Both of these
buses contain attributes that make them suitable for high bandwidth delivery of video data, provided that means are
taken to control the speed and isochronous delivery of video data.

The SCSI bus allows for the bursting of data at 20 Mbytes/sec., which minimizes the amount of time that any one
video signal is being moved from buffer memory to a specific NTSC adapter. The adapter card 212 contains a large
buffer 214 with a performance capability to burst data into memory from bus 210 at high peak rates and to remove data
from buffer 214 at much lower rates for delivery to NTSC decoder chips 216, 218. Buffer 214 is further segmented into
smaller buffers and connected via software controls to act as multiple buffers connected in a circular manner.

This allows the system to deliver varying block sizes of data to separate buffers and controls the sequence of playout.
An advantage of this approach is that it frees the system software to deliver blocks of video data well in advance of any
requirement for the video data, and at very high delivery rates. This provides the media streamer 10 with the ability to
manage many multiple video streams on a dynamic throughput requirement. When a processor in a communications
node has time, it can cause delivery of several large blocks of data that will be played in sequence. Once this is done,
the processor is free to control other streams without an immediate need to deliver slow continuous isochronous data
to each port.

To further improve the cost effectiveness of the decoder system, a small FIFO memory 224 is inserted between the
larger decoder buffer 214 and MPEG decoders 216, 218. The FIFO memory 224 allows controller 226 to move smaller
blocks, typically 512 bytes of data, from buffer 214 to FIFO 224 which, in turn, converts the data into serial bit streams
for delivery to MPEG decoders 216, 218. Both the audio and the video decoder chips 216, 218 can take their input from
the same serial data stream, and internally separate and decode the data required. The transmission of data from the
output of the FIFO memory 224 occurs in an isochronous manner, or substantially isochronous manner, to ensure the
delivery of an uninterrupted video presentation to a user or consumer of the video presentation.



K. TRANSMISSION OF DIGITAL VIDEO TO SCSI DEVICES 



As shown in Fig. 22, compressed digital video data and command streams from buffer memory are converted by
device level software into SCSI commands and data streams, and are transmitted over SCSI bus 210 to a target adapter
212 at SCSI II fast data rates. The data is then buffered and fed at the required content output rate to MPEG logic for
decompression and conversion to analog video and audio data. Feedback is provided across SCSI bus 210 to pace the
data flow and ensure proper buffer management.

The SCSI NTSC/PAL adapter 212 provides a high level interface to SCSI bus 210, supporting a subset of the
standard SCSI protocol. The normal mode of operation is to open the adapter 212, write data (video and audio) streams
to it, and close the adapter 212 only when completed. Adapter 212 pulls data as fast as necessary to keep its buffers
full, with the communication nodes 14 and storage nodes 16 providing blocks of data that are sized to optimize the bus
data transfer and minimize bus overhead.

System parameters can be overwritten via control packets using a Mode Select SCSI command if necessary. Video/Audio
synchronization is internal to the adapter 212 and no external controls are required. Errors are minimized, with
automatic resynchronization and continued audio/video output.






K1. SCSI LEVEL COMMAND DESCRIPTION 



A mix of direct access device and sequential device commands are used as well as standard common commands
to fit the functionality of the SCSI video output adapter. As with all SCSI commands, a valid status byte is returned after
every command, and the sense data area is loaded with the error conditions if a check condition is returned. The standard
SCSI commands used include RESET, INQUIRY, REQUEST SENSE, MODE SELECT, MODE SENSE, READ, WRITE,
RESERVE, RELEASE, TEST UNIT READY.



Video Commands:



The video control commands are user-level video output control commands, and are extensions to the standard
commands listed above. They provide a simplified user level front end to the low level operating system or SCSI commands
that directly interface to the SCSI video output adapter 212. The implementation of each command employs
microcode to emulate the necessary video device function and avoid video and audio anomalies caused by invalid
control states. A single SCSI command, the SCSI START/STOP UNIT command, is used to translate video control
commands to the target SCSI video output adapter 212, with any necessary parameters moved along with the command.
This simplifies both the user application interface and the adapter card 212 microcode. The following commands are
employed.



Stop (SCSI START/STOP 1 - parameter = mode) 



The data input into the MPEG chip set (216, 218) is halted, the audio is muted, and the video is blanked. The
parameter field selects the stop mode. The normal mode is for the buffer and position pointer to remain current, so that
PLAY continues at the same location in the video stream. A second (end of movie or abort) mode is to set the buffer
pointers to the start of the next buffer and release the current buffer. A third mode is also for end of movie conditions,
but the stop (mute and blank) is delayed until the data buffer runs empty. A fourth mode may be employed with certain
MPEG decoder implementations to provide for a delayed stop with audio, but freeze frame for the last valid frame when
the data runs out. In each of these cases, the video adapter 212 microcode determines the stopping point so that the
video and audio output is halted on the proper boundary to allow a clean restart.


Pause (SCSI START/STOP 2 - no parameters) 

The data input into the MPEG chip set (216, 218) is halted and the audio is muted, but the video is not blanked.
This causes the MPEG video chip set (216, 218) to hold a freeze frame of the last good frame. This is limited to avoid
burn-in of the video tube. A Stop command is preferably issued by the control node 18, but the video output will automatically
go to blank if no commands are received within 5 minutes. The adapter 212 microcode maintains the buffer
positions and decoder states to allow for a smooth transition back to play.

Blank-Mute (SCSI START/STOP 3 - parameter = mode) 


This command blanks the video output without impacting the audio output, mutes the audio output without impacting
the video, or both. Both muting and blanking can be turned off with a single command using a Mode parameter, which
allows a smoother transition and reduced command overhead. These are implemented on the video adapter 212 after
decompression and conversion to analog, with hardware controls to ensure a positive, smooth transition.


Slow Play (SCSI START/STOP 4 - parameter = rate) 

This command slows the data input rate into the MPEG chip set (216, 218), causing it to intermittently freeze frame,
simulating a slow play function on a VCR. The audio is muted to avoid digital error noise. The parameter field specifies
a relative speed from 0 to 100. An alternative implementation disables the decoder chip set (216, 218) error handling,
and then modifies the data clocking speed into the decoder chip set to the desired playing speed. This is dependent on
the flexibility of the video adapter's clock architecture.

Play (SCSI START/STOP 5 - parameter = buffer) 

This command starts the data feed process into the MPEG chip set (216, 218), enabling the audio and video outputs.
A buffer selection number is passed to determine which buffer to begin the playing sequence from, and a zero value
indicates that the current play buffer should be used (typical operation). A non-zero value is only accepted if the adapter
212 is in STOPPED mode; if in PAUSED mode, the buffer selection parameter is ignored and playing is resumed using
the current buffer selection and position.

When 'PLAYING', the controller 226 rotates through the buffers sequentially, maintaining a steady stream of data
into the MPEG chip set (216, 218). Data is read from the buffer at the appropriate rate into the MPEG bus starting at
address zero until N bytes are read, then the controller 226 switches to the next buffer and continues reading data. The
adapter bus and microcode provide sufficient bandwidth for both the SCSI Fast data transfer into the adapter buffers
214 and the steady loading of the data onto the output FIFO 224 that feeds the MPEG decompression chips (216, 218).
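The Play parameter rule and the sequential buffer rotation described above can be sketched as follows. This is a minimal illustration only, not the adapter microcode: the class and method names, the 1-based-to-0-based mapping of the buffer selection number, and the choice to silently ignore a non-zero selection outside STOPPED mode are all assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    STOPPED = "stopped"
    PAUSED = "paused"
    PLAYING = "playing"


@dataclass
class Adapter:
    """Hypothetical model of the play-side state kept by the adapter."""
    num_buffers: int
    mode: Mode = Mode.STOPPED
    current: int = 0    # index of the current play buffer (0-based, assumed)
    position: int = 0   # byte offset within the current buffer

    def play(self, buffer_param: int = 0) -> None:
        """Handle Play; buffer_param 0 means 'use the current play buffer'."""
        if buffer_param != 0 and self.mode is Mode.STOPPED:
            if not 1 <= buffer_param <= self.num_buffers:
                raise ValueError("buffer selection out of range")
            # A non-zero selection is honoured only from STOPPED mode.
            self.current = buffer_param - 1
            self.position = 0
        # In PAUSED mode the parameter is ignored and play resumes in place.
        self.mode = Mode.PLAYING

    def buffer_drained(self) -> None:
        """Called once N bytes have been read: rotate to the next buffer."""
        self.current = (self.current + 1) % self.num_buffers
        self.position = 0
```

A caller would invoke `play()` with no argument in the typical case and call `buffer_drained()` each time the N bytes of the current buffer have been fed to the decoder.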

Fast Forward (SCSI START/STOP 6 - parameter = rate)

This command is used to scan through data in a manner that emulates fast forward on a VCR. There are two modes
of operation that are determined by the rate parameter. A rate of 0 means that it is a rapid fast forward where the video
and audio should be blanked and muted, the buffers flushed, and an implicit play is executed when data is received
from a new position forward in the video stream. An integer value between 1 and 10 indicates the rate that the input
stream is being forwarded. The video is 'sampled' by skipping over blocks of data to achieve the specified average data
rate. The adapter 212 plays a portion of data at nearly the normal rate, jumps ahead, then plays the next portion to
emulate the fast forward action.
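The 'sampled' scan for rates 1 to 10 can be illustrated with a short sketch. It assumes, purely for illustration, that the rate is a speed multiple and that the stream is addressed as numbered blocks; the function name and the `run` parameter (how many consecutive blocks are played before each jump) are hypothetical.

```python
from typing import Iterator


def fast_forward_blocks(start: int, total: int, rate: int, run: int = 1) -> Iterator[int]:
    """Yield the indices of the blocks to play while scanning forward.

    Plays `run` consecutive blocks, then jumps ahead so that on average
    only one block in every `rate` is played (rate 1 is normal play).
    """
    if not 1 <= rate <= 10:
        raise ValueError("rate must be an integer from 1 to 10")
    pos = start
    while pos < total:
        # Play a short portion at (nearly) the normal rate.
        for i in range(pos, min(pos + run, total)):
            yield i
        # Jump ahead past the blocks that are skipped.
        pos += run * rate
```

With `rate=3` and `run=1` over ten blocks, the blocks played are 0, 3, 6 and 9. A larger `run` gives longer watchable portions between jumps at the same average rate.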

Rewind (SCSI START/STOP 7 - parameter = buffer) 

This command is used to scan backwards through data in a manner that emulates rewind on a VCR. There are two
modes of operation that are determined by the rate parameter. A rate of 0 means that it is a rapid rewind where the
video and audio should be blanked and muted, the buffers flushed, and an implicit play executed when data is received
from a new position forward in the video stream. An integer value between 1 and 10 indicates the rate that the input
stream is being rewound. The video is 'sampled' by skipping over blocks of data to achieve the specified average data
rate. The rewind data stream is built by assembling small blocks of data that are 'sampled' from progressively earlier
positions in the video stream. The adapter card 212 smoothly handles the transitions and synchronization to play at the
normal rate, skipping back to the next sampled portion to emulate rewind scanning.

K2. BUFFER MANAGEMENT

Digital video servers provide data to many concurrent output devices, but digital video data decompression and
conversion requires a constant data stream. Data buffering techniques are used to take advantage of the SCSI data
burst mode transmission, while still avoiding data underrun or buffer overrun, allowing media streamer 10 to transmit
data to many streams with minimal intervention. SCSI video adapter card 212 (Figs. 21, 22) includes a large buffer 214
for video data to allow full utilization of the SCSI burst mode data transfer process. An exemplary configuration would
be one buffer 214 of 768K, handled by local logic as a wrap-around circular buffer. Circular buffers are preferred to
dynamically handle varying data block sizes, rather than fixed length buffers that are inefficient in terms of both storage
and management overhead when transferring digital video data.
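A wrap-around circular buffer that accepts variable-length blocks can be sketched as below. This is an illustrative model under stated assumptions, not the adapter's local logic: the class name, the byte-level interface and the whole-block accept-or-refuse policy are hypothetical, and only the 768K default capacity comes from the exemplary configuration above.

```python
class CircularBuffer:
    """Byte ring that stores variable-length blocks (hypothetical sketch)."""

    def __init__(self, capacity: int = 768 * 1024):
        self.data = bytearray(capacity)
        self.capacity = capacity
        self.head = 0   # offset of the next byte to read
        self.used = 0   # number of bytes currently stored

    def free(self) -> int:
        return self.capacity - self.used

    def write(self, block: bytes) -> bool:
        """Store a whole block, wrapping at the end; False if it does not fit."""
        if len(block) > self.free():
            return False
        tail = (self.head + self.used) % self.capacity
        first = min(len(block), self.capacity - tail)
        self.data[tail:tail + first] = block[:first]
        # Any remainder wraps around to the start of the buffer.
        self.data[0:len(block) - first] = block[first:]
        self.used += len(block)
        return True

    def read(self, n: int) -> bytes:
        """Remove and return up to n bytes in arrival order."""
        n = min(n, self.used)
        first = min(n, self.capacity - self.head)
        out = bytes(self.data[self.head:self.head + first]) + bytes(self.data[0:n - first])
        self.head = (self.head + n) % self.capacity
        self.used -= n
        return out
```

Because space is tracked in bytes rather than in fixed slots, a short block consumes only its own length, which is the storage advantage the text attributes to circular buffers.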

The video adapter card 212 microcode supports several buffer pointers, keeping the last top of data as well as the
current length and top of data. This allows a retry to overwrite a failed transmission, or a pointer to be positioned to a byte
position within the current buffer if necessary. The data block length is maintained exactly as transmitted (e.g., byte or
word specific even if long word alignment is used by the intermediate logic) to ensure valid data delivery to the decode
chip set (216, 218). This approach minimizes the steady state operation overhead, while still allowing flexible control of
the data buffers.

K2.1. BUFFER SELECTION AND POSITION 

Assuming multiple sets of buffers are required, multiple pointers are available for all buffer related operations. For
example, one set may be used to select the PLAY buffer and current position within that buffer, and a second set to
select the write buffer and a position within that buffer (typically zero) for a data preload operation. A current length and
maximum length value are maintained for each block of data received since variable length data blocks are also supported.

K2.2. AUTOMATIC MODE

The buffer operation is managed by the video adapter's controller 226, placing the N bytes of data in the next
available buffer space starting at address zero of that buffer. Controller 226 keeps track of the length of data in each
buffer and if that data has been "played" or not. Whenever sufficient buffer space is free, the card accepts the next
WRITE command and DMA's the data into that buffer. If not enough buffer space is free to accept the full data block
(typically a Slow Play or Pause condition), the WRITE is not accepted and a buffer full return code is returned.
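The accept-or-refuse behaviour of automatic mode can be sketched as follows. This is a simplified model that assumes a fixed set of buffer slots written in rotation; the class names and the `ACCEPTED` and `BUFFER_FULL` return values are hypothetical stand-ins for whatever return codes the adapter actually uses.

```python
from dataclasses import dataclass
from typing import List

ACCEPTED = "ACCEPTED"         # hypothetical return code names
BUFFER_FULL = "BUFFER_FULL"


@dataclass
class Slot:
    size: int             # capacity of this buffer in bytes
    length: int = 0       # length of the data currently held
    played: bool = True   # an empty or fully played slot may be reused


@dataclass
class AutoBuffers:
    slots: List[Slot]
    next_write: int = 0   # slot that the next WRITE will fill

    def write(self, block: bytes) -> str:
        """Accept a WRITE into the next slot, or refuse with buffer full."""
        slot = self.slots[self.next_write]
        if len(block) > slot.size:
            raise ValueError("block larger than a buffer")
        if not slot.played:
            # Data not yet played (e.g. Pause or Slow Play): refuse the WRITE.
            return BUFFER_FULL
        slot.length = len(block)   # data is placed starting at address zero
        slot.played = False
        self.next_write = (self.next_write + 1) % len(self.slots)
        return ACCEPTED

    def mark_played(self, index: int) -> None:
        """Record that the decoder has consumed the data in a slot."""
        self.slots[index].played = True
```

The sender is expected to retry a refused WRITE later, once play has freed the slot.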

K2.3. MANUAL MODE 

A LOCATE command is used to select a 'current' write buffer and position within that buffer (typically zero) for each
buffer access command (Write, Erase, etc.). The buffer position is relative to the start of data for the last block of data
that was successfully transmitted. This is done preferably for video stream transition management, with the automatic
mode reactivated as soon as possible to minimize command overhead in the system.

K2.4. ERROR MANAGEMENT

Digital video data transmission has different error management requirements than the random data access usage
that SCSI is normally used for in data processing applications. Minor data loss is less critical than transmission
interruption, so the conventional retries and data validation schemes are modified or disabled. The normal SCSI error handling
procedures are followed with the status byte being returned during the status phase at the completion of each command.
The status byte indicates either a GOOD (00h) condition, a BUSY (08h) condition if the target SCSI chip 227 is unable to
accept a command, or a CHECK CONDITION (02h) if an error has occurred.
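The three status byte values can be decoded as in the sketch below. The numeric codes are the ones given in the text; the action names returned by `next_action` are hypothetical labels chosen for illustration.

```python
from enum import IntEnum


class ScsiStatus(IntEnum):
    GOOD = 0x00
    CHECK_CONDITION = 0x02
    BUSY = 0x08


def next_action(status_byte: int) -> str:
    """Map a returned status byte to an illustrative follow-up action."""
    status = ScsiStatus(status_byte)   # raises ValueError for other codes
    if status is ScsiStatus.GOOD:
        return "continue"
    if status is ScsiStatus.BUSY:
        return "retry"                 # target could not accept the command
    return "request_sense"             # CHECK CONDITION: fetch error details
```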

K2.5. ERROR RECOVERY 

The controller 226 of the SCSI video adapter 212 automatically generates a Request Sense command on a Check
Condition response to load the error and status information, and determines if a recovery procedure is possible. The
normal recovery procedure is to clear the error state, discard any corrupted data, and resume normal play as quickly
as possible. In a worst case, the adapter 212 may have to be reset and the data reloaded before the play can resume.
Error conditions are logged and reported back to the host system with the next INQUIRY or REQUEST SENSE SCSI
operation.

K2.6. AUTOMATIC RETRIES

For buffer full or device busy conditions, retries are automated up to X number of retries, where X is dependent on
the stream data rate. Retries are allowed only up to the point in time that the next data buffer arrives. At that point, an error is
logged if the condition is unexpected (i.e., Buffer full but not PAUSED or in SLOW PLAY mode) and a device reset or
clear may be necessary to recover and continue video play.
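One way to see why X depends on the stream data rate is to count how many retries fit before the next data block is due. The sketch below is an assumption-laden illustration: the text does not give a formula, and the function names, the fixed retry interval and the mode strings are hypothetical.

```python
import math


def retry_budget(block_bytes: int, stream_bits_per_s: float, retry_interval_s: float) -> int:
    """Number of retries that fit before the next data block is due."""
    # Time covered by one block at the stream's data rate.
    block_period_s = block_bytes * 8 / stream_bits_per_s
    return max(0, math.floor(block_period_s / retry_interval_s))


def buffer_full_is_unexpected(mode: str) -> bool:
    """Buffer full is expected only while PAUSED or in SLOW PLAY."""
    return mode not in ("PAUSED", "SLOW_PLAY")
```

For a 256 KB block and a 0.1 s retry interval, a 1.5 Mbit/s stream leaves room for 13 retries while a 6 Mbit/s stream leaves room for only 3, so faster streams get a smaller X.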

Although described primarily in the context of delivering a video presentation to a user, it should be realized that
bidirectional video adapters can be employed to receive a video presentation, to digitize the video presentation as a
data representation thereof, and to transmit the data representation over the bus 210 to a communication node 14 for
storage, via low latency switch 18, within a storage node or nodes 16, 17 as specified by the control node 18.

Claims 

1. A media streamer, comprising:

at least one storage node for storing a digital representation of at least one video presentation, said at least
one video presentation requiring a time T to present in its entirety, and stored as a plurality of N data blocks, each
data block comprising a T/N portion of said at least one video presentation, said at least one storage node comprising
a first data buffer for buffering at least one of said N data blocks;

a plurality of communication nodes each having an input port that is coupled via a circuit switch to an output
of said first data buffer for sequentially receiving a plurality of said N data blocks therefrom, said sequentially received
N data blocks being associated with a same video presentation or with different video presentations, each of said
plurality of communication nodes further having a plurality of output ports, individual ones of said plurality of output
ports outputting a digital representation of one video presentation, individual ones of said plurality of communication
nodes further comprising a second data buffer for buffering at least one of said N data blocks prior to outputting
said at least one of said N data blocks; and

at least one control node responsive to a first operating condition for causing transfer of one of said N data
blocks from said first data buffer to an output port of a first communication node and also to an output port of a
second communication node, said at least one control node being further responsive to a second operating condition
for causing transfer of one of said N data blocks from said first data buffer to said second data buffer of one of said
communication nodes, and for causing transfer of said one of said N data blocks from said second data buffer to a
plurality of said output ports of said one of said communication nodes.

2. A media streamer as set forth in claim 1 and further including means for selectively retaining one of said N data
blocks within said first data buffer if it is predicted that said one of said N data blocks will be output from at least
one of said communications nodes within a predetermined period of time.

3. A media streamer as set forth in claim 1 and further including means for selectively retaining one of said N data
blocks within said second data buffer if it is predicted that said one of said N data blocks will be output from at least
one of said output ports of a communications node within a predetermined period of time.

4. A media streamer as set forth in claim 2 or claim 3, wherein, for one of said N data blocks that is not to be retained,
said media streamer includes means for replacing said one of said N data blocks within said data buffer, said replacing
means being responsive to a predicted demand for the associated video presentation and also to a location,
within a corresponding data representation of said one of said N data blocks, for determining a priority of retaining
said one of said N data blocks with respect to others of said N data blocks stored within said data buffer.

5. A media streamer as set forth in claim 4 wherein a higher priority is assigned to a data block that is located at or
near a beginning of a data representation than is assigned to a data block that is located at or near an end of said
data representation.

6. A media streamer as set forth in claim 2 or claim 3 wherein, for one of said N data blocks that is to be retained, said
media streamer includes means for replacing said one of said N data blocks within said data buffer, said replacing
means being responsive to a next predicted time that the said one of said N data blocks is required to be output
from at least one of said communication nodes, and also to a number of output ports that are outputting a digital
representation with which the said one of said N data blocks is associated.

7. A media streamer as set forth in claim 1 wherein said at least one control node further includes means for
synchronizing a first outputted data representation to a second outputted data representation such that said first data
representation and said second data representation simultaneously output data from a same one of said N data blocks.

8. A media streamer as set forth in claim 1 wherein said first data buffer and said second data buffer are of approximately
equal size.

9. A media streamer as set forth in claim 1 wherein said first data buffer and said second data buffer are components
of a single data buffer that is distributed between said at least one storage node and said plurality of communication
nodes.

10. A data storage system comprising:

a mass storage unit storing a data entity that is partitioned into a plurality N of temporally-ordered segments;

a data buffer that is bidirectionally coupled to said mass storage unit for storing up to M of said temporally-ordered
segments, wherein M is less than N, said data buffer having an output for outputting stored ones of
said temporally-ordered segments; and

a data buffer manager for scheduling transfers of individual ones of said temporally-ordered segments between
said mass storage unit and said data buffer, said data buffer manager scheduling said transfers in accordance with
at least a predicted time that an individual one of said temporally-ordered segments will be required to be output
from said data buffer.

FIG. 2.